Multitype branching processes with immigration in one type are
used to model the dynamics of stage-structured plant populations. 
Parametric inference is first carried out when count data of all types 
are observed. Statistical identifiability is proved together with deriva- 
tion of consistent and asymptotically Gaussian estimators for all the 
parameters ruling the population dynamics model. However, for many 
ecological data, some stages (i.e. types) cannot be observed in prac- 
tice. We study which mechanisms can still be estimated given the 
model and the data available in this context. Parametric inference 
is investigated in the case of Poisson distributions. We prove that 
identifiability holds for only a subset of the parameter set depend- 
ing on the number of generations observed, together with consistent 
and asymptotic properties of estimators. Finally, simulations are per- 
formed to study the behaviour of the estimators when the model is 
no longer Poisson. Quite good results are obtained for a large class of 
models with distributions having mean and variance within the same 
order of magnitude, leading to some stability results with respect to 
the Poisson assumption. 



1. Introduction. Understanding population dynamics requires mod- 
els that admit the complexity of natural populations and the data ecologists 
can get from them. Thus analyzing ecological data raises questions ranging 
from modeling purposes to statistical inference. Among various methods, 
Leslie matrices or demographic matrix models are widely used for studying 
the dynamics of age or stage-structured populations (e.g. Caswell, 2001). 
These models are deterministic with noise added to introduce some vari- 
ability. In many cases however and especially for small populations, the 
demographic stochasticity has to be taken into account; these models are
too simple (Melbourne and Hastings, 2008) and can no longer be used, even
when stochasticity is added to the dynamics through random effects and
covariates (Royle, 2004; Barry et al., 2003) or Bayesian approaches
(Raftery et al., 1995; Gross et al., 2002; Clark and Bjornstad, 2004). For
these reasons, we use here stochastic models to study the dynamics of small
populations.

The starting point of this work is a three-year field survey of feral
populations (i.e. populations escaped from crops) of an annual crop species
(oilseed rape) that was carried out in the center of France (Selommes,
Loir-et-Cher; Garnier, 2006). Unlike cultivated oilseed rape, very few facts
are known about the dynamics of feral oilseed rape populations. In this
study, the dynamics is modeled by a multitype branching process (five types
including vegetative and reproductive plant stages along with seeds in the
soil seed bank) with immigration in one type (seeds). The data consisted of
population counts in each type, except the seeds, which could not be
observed. Three main difficulties occur when studying this demographic
dataset: (1) a large number of populations (K = 300) have been observed over
a short period of time (n = 2, 3); (2) only count data have been collected;
(3) some types could not be observed by ecologists (here seeds). These
characteristics are clearly not specific to this survey and are frequently
met in data coming from Population Genetics and Ecology (see e.g. de
Valpine, 2004 and the references therein). These data could be studied as
longitudinal data but, where the dynamics is the concern, better insights
can be obtained by means of mechanistic models describing it.

Branching processes have largely been studied (see Athreya and Ney, 1972 
for a general presentation; Mode, 1971 for Multitype Branching Processes 
and Haccou et al., 2005; Kimmel and Axelrod, 2002; Mode and Sleeman, 
2000 for applications in biology). Statistical inference has also been largely 
investigated (Hall and Heyde, 1980; Guttorp, 1991 for general branching 
processes; Wei and Winnicki, 1990; Winnicki, 1991 for branching processes 
with immigration; Bhat and Adke, 1981; Maaouia and Touati, 2005; Gonzalez et al., 
2008 for multitype branching processes). However, the precise multitype 
branching process with immigration used here is a combination of the previ- 
ous ones and moreover statistical inference for multitype branching processes 
is usually based on the following different observations (Maaouia and Touati, 
2005; Gonzalez et al., 2008): the number of descendants of type j coming 
from all the type i individuals. We just observed the successive counts of
"individuals" of each type. This is a more realistic assumption since this
situation frequently occurs with datasets from field studies, and inference
is studied here in this framework.
We are interested in the estimation of the parameters involved in the
population dynamics from the incomplete observations of count data
collected simultaneously in several populations. This is an "Incomplete Data
model", or "State Space model" (as defined for instance in Cappe et al.,
2005). It is also an inverse problem and a central theme in Ecology arising
from its study is that parameters might not be identifiable knowing only
the population dynamics (Wood, 1997). In practice, the inference based
on such data is performed using various E.M. algorithms, possibly coupled
with Monte Carlo methods (Dempster et al., 1977; Kuhn and Lavielle, 2004;
McLachlan and Krishnan, 2007; Sung and Geyer, 2007; Olsson et al., 2008),
and Bayesian or Hierarchical Bayesian methods (Clark and Bjornstad, 2004;
Buckland et al., 2004; Thomas et al., 2005). All these methods circumvent
but cannot address the identifiability problem. However, identifiability is a
prerequisite of statistical inference, and understanding the mechanisms of
the dynamics strongly relies on how parameters are linked in the
identifiability question. We propose here an integrated framework in order
to analyze as accurately as possible the whole data set of the field survey.
The introduction of covariates and a priori knowledge, the errors coming
from non-exhaustive sampling within some populations, and the use of various
algorithms rely on this work and are studied in two companion papers
(David et al., 2008; Garnier et al., 2008).

The paper is organized as follows. Section 2 contains the description of the 
population dynamics and preliminary results (Proposition 2.1). The statisti- 
cal inference for complete observations is studied in Section 3. We first prove 
identifiability for all the parameters and derive consistent and asymptotically 
Gaussian estimators (Proposition 3.1). Since seeds are not observed in prac- 
tice, the problem of unobserved types is addressed in Section 4. This is a non 
linear non Gaussian state space model. The associated three-dimensional 
stochastic process is no longer Markovian. The model with Poisson distri- 
butions provides a useful example with explicit computations. We obtain 
a closed form of the dependence of present observations on the whole past 
for the three-dimensional process (Theorem 4.1). A question concerns the 
statistical model identifiability : it is studied according to the number of 
observed generations. We characterize the parameter subset where identifia- 
bility holds (Theorem 4.2) and study the parametric inference (Proposition 
4.2). Section 5 presents simulation results on how the estimation performs
under deviations from the Poisson model. Detailed proofs are
given in the Appendix. 

2. Model and preliminary results. 
2.1. Dynamics of annual plants. We consider annual plants with the
following life cycle. Seeds are released at the end of summer; they can
either enter a seed bank if buried or germinate in autumn. The emerged
rosettes vernalize during winter, then bolt in spring and finally produce
mature plants that shed seeds in summer and then die. Five developmental
stages are considered: rosettes before winter R, rosettes after winter
(vernalized rosettes) V, mature plants F, seeds located in the soil seed bank
("old seeds") S, and seeds located on the soil surface ("new seeds") T. New
seeds and old seeds are separated because they have different demographic
parameters. Within each cycle, new seeds can enter these populations at the
end of summer (immigration). There exist two sources of seed immigration:
seeds from adjacent mature crops and seeds from spillage during seed
transport (Crawley and Brown, 1995; Claessen et al., 2005a; Garnier et al.,
2006; Pivard et al., 2008).




Fig 1. Schematic Dynamics of feral oilseed rape populations. 
This model is quite general and for dynamical purposes only, it could be
simplified considering just seeds and mature plants. However, our concern 
is different since we aim at estimating as many parameters as possible given 
both the model and the data available. Keeping a five-type model allows 
using all the data collected in the field survey and thus leads to the best 
description of the plant population dynamics for inference. 

2.2. Notations and assumptions. Consider first one population. From 
now on, the term year corresponds to one life cycle. It starts with the birth 
of the new seeds and ends just before the birth of the new seeds of the next 
generation. All the variables are integer random variables indexed by $i \in \mathbb{N}$.
For year $i$, denote by $S_i$ the number of "old seeds", $T_i$ the number of "new
seeds", $R_i$ the number of rosettes before winter, $V_i$ the number of rosettes
after winter and $F_i$ the number of mature plants. Six parameters describe
these transitions:



(2.1)  $(c, d, a, b, a', b') \in (0,1)^6$ with $0 < a + b < 1$, $0 < a' + b' < 1$.

(2.2)  $\mathbb{P}(\text{seed in } S_i \to S_{i+1}) = a$ ;  $\mathbb{P}(\text{seed in } S_i \to R_i) = b$

(2.3)  $\mathbb{P}(\text{seed in } T_i \to S_{i+1}) = a'$ ;  $\mathbb{P}(\text{seed in } T_i \to R_i) = b'$

(2.4)  $\mathbb{P}(\text{rosette in } R_i \to V_i) = c$ ;  $\mathbb{P}(\text{rosette in } V_i \to F_i) = d$.

Mature plants in $F_i$ produce "new seeds" $T'_{i+1}$ according to the offspring
distribution $G(.)$.

A number $I_i$ of "new seeds" immigrate into the population at the beginning
of year $i$; it is assumed to follow the distribution $\mu(.)$. Seeds in $S_i$ come
from two sources, the "old seeds" $S_{i-1}$ and the "new seeds" $T_{i-1}$. Denote
by $S'_i$ and $S''_i$ these two quantities. Similarly, rosettes before winter $R_i$
come from "old seeds" in $S_i$ and "new seeds" in $T_i$. Denote by $R'_i$ and $R''_i$
the rosettes coming from $S_i$ and $T_i$. These variables satisfy:

(2.5)  $S_i = S'_i + S''_i$ ,  $R_i = R'_i + R''_i$ ,

(2.6)  $T_i = T'_i + I_i$.



Note that the probabilities of dying for stages S, T, R, V are respectively
$(1 - a - b)$, $(1 - a' - b')$, $(1 - c)$, $(1 - d)$ and that the probability of no
offspring for a mature plant is $G(0)$.
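The transitions (2.2)-(2.6) can be read as a simulation recipe for one life cycle. The following is a minimal sketch under stated assumptions, not the authors' code: the offspring and immigration laws are taken to be Poisson purely for illustration, the helper names (`binomial`, `poisson`, `split_seeds`, `one_year`) and the dictionary keys `a1`, `b1` (standing for $a'$, $b'$) are hypothetical, and the numerical values are arbitrary.

```python
import math
import random

def binomial(rng, n, p):
    """Draw from B(n; p) as a sum of n Bernoulli trials (fine for small counts)."""
    return sum(1 for _ in range(n) if rng.random() < p)

def poisson(rng, lam):
    """Draw from a Poisson law with mean lam (Knuth's method, moderate lam only)."""
    threshold, k, prod = math.exp(-lam), 0, rng.random()
    while prod > threshold:
        k += 1
        prod *= rng.random()
    return k

def split_seeds(rng, n, p_bank, p_rosette):
    """Trinomial split of n seeds: seed bank, rosette, or death."""
    to_bank = binomial(rng, n, p_bank)
    # Among seeds that did not enter the bank, germination has this conditional probability.
    to_rosette = binomial(rng, n - to_bank, p_rosette / (1.0 - p_bank))
    return to_bank, to_rosette

def one_year(rng, s, t, par):
    """From (S_i, T_i), return (R_i, V_i, F_i, S_{i+1}, T_{i+1})."""
    s_from_s, r_from_s = split_seeds(rng, s, par["a"], par["b"])    # (2.2)
    s_from_t, r_from_t = split_seeds(rng, t, par["a1"], par["b1"])  # (2.3)
    r = r_from_s + r_from_t                                         # (2.5)
    v = binomial(rng, r, par["c"])                                  # (2.4)
    f = binomial(rng, v, par["d"])                                  # (2.4)
    offspring = sum(poisson(rng, par["m"]) for _ in range(f))       # G: Poisson (assumption)
    immigrants = poisson(rng, par["u"])                             # mu: Poisson (assumption)
    return r, v, f, s_from_s + s_from_t, offspring + immigrants     # (2.5), (2.6)

# Arbitrary illustrative values.
params = {"a": 0.15, "b": 0.5, "a1": 0.006, "b1": 0.5,
          "c": 0.21, "d": 0.3, "m": 13.0, "u": 20.0}
rng = random.Random(0)
s, t = 40, 60
for year in range(3):
    r, v, f, s_next, t_next = one_year(rng, s, t, params)
    print(year, (s, t, r, v, f))
    s, t = s_next, t_next
```

Drawing the seed-bank count first and the rosette count conditionally gives the same trinomial split as a single multinomial draw, which keeps the sketch free of external libraries.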
Let us now detail the framework and assumptions used in the sequel. The
field survey consisted of a large number of feral oilseed rape populations
(around 300 in Garnier, 2006) observed over a short period of time (n =
2 or 3). These populations were isolated, so we assume here that they are
independent. The plant density was low in the surveyed populations, so that
density-dependence in plant survival and reproduction could be neglected
(Pivard et al., 2008; Garnier, 2006). Moreover, we assume

Assumption 1. Offspring distribution $G(.)$
All mature plants reproduce independently according to the same offspring
distribution $G(.)$ with expectation $m$ and variance $\delta^2$.

Assumption 2. There is no competition in survival and germination
between seeds in the seed bank or "old seeds" $S_i$ and seeds on the soil or
"new seeds" $T_i$.

Assumption 3. Immigration distribution $\mu(.)$

(i) Immigration $I_i$ is independent of seed bank seeds $S_i$, offspring seeds $T'_i$
and of previous years.

(ii) The random variables $(I_i, i \in \mathbb{N})$ are independent and identically
distributed according to $\mu(.)$ with expectation $u$ and variance $\rho^2$.

The above assumptions are of two kinds: "independence" and "identically
distributed". They do not have the same status: while relaxing the
"independence" assumption is quite difficult, the "identically distributed"
assumption is made here for the sake of clarity. The independence assumption
in (A1) is justified by the low plant density. There is no biological
background for considering competition between the evolution of "old seeds"
$S_i$ and "new seeds" $T_i$, leading to (A2). In the field survey of feral oilseed
rape populations, seed immigration mainly occurred from spillage during seed
transport and from adjacent mature crops, leading to the independence
assumptions in (A3) (Pivard et al., 2008; Garnier, 2006). Covariates or
a priori information can easily be introduced within this framework, which
amounts to removing the "identically distributed" assumption. This has indeed
been done for the statistical analysis of the whole data set: since
many populations had been observed, covariates, a priori knowledge, and a
dependence on $i, k$ of the parameters defined in (2.2)-(2.4) have
been added (Garnier et al., 2008). The framework detailed here is therefore
quite general.
2.3. Preliminary results. From now on, time $i$ is associated with a complete
cycle of a plant. Let us still consider one population. The model is a
discrete time stochastic process $(X_i)$ with state space $\mathbb{N}^5$. Set

(2.7)  $X_i = (S_i, T_i, R_i, V_i, F_i)$ and $\mathcal{F}_i = \sigma(X_0, X_1, \dots, X_i)$.

For $x = (s,t,r,v,f)$, $x' = (s',t',r',v',f') \in \mathbb{N}^5$, denote by $\pi_0(x)$ the initial
distribution and by $p(x, x')$ the conditional distribution of $X_1$ given $X_0$:

(2.8)  $\pi_0(x) = \mathbb{P}(X_0 = x)$ ;  $p(x, x') = \mathbb{P}(X_1 = x' / X_0 = x)$.

For the Binomial distribution $\mathcal{B}(N;p)$, we write $\mathbb{P}(Y = k) = \mathcal{B}(N;p)(k)$;
Multinomial distributions on $\mathbb{N}^l$, $\mathcal{M}(N; p_1, \dots, p_l)$, are simplified by omitting
the last component, which leads, for $l = 3$ and $0 < p_1 + p_2 < 1$, to

(2.9)  $\mathcal{M}(N; p_1, p_2)(n_1, n_2) = \mathcal{M}(N; p_1, p_2, 1 - p_1 - p_2)(n_1, n_2, N - n_1 - n_2)$.

Let $*$ denote the convolution product of two distributions. Using these
notations and (2.1), define the distributions

(2.10)  $\nu(s,t) = \mathbb{P}(S_0 = s, T_0 = t)$,

(2.11)  $p_1(s'/s,t,r) = \dfrac{(\mathcal{M}(s;a,b) * \mathcal{M}(t;a',b'))(s',r)}{(\mathcal{B}(s;b) * \mathcal{B}(t;b'))(r)}$,

(2.12)  $p_2(t'/f) = (G^{*f} * \mu)(t')$ ;  $p_3(r/s,t) = (\mathcal{B}(s;b) * \mathcal{B}(t;b'))(r)$,

(2.13)  $p_4(v/r) = \mathcal{B}(r;c)(v)$ ,  $p_5(f/v) = \mathcal{B}(v;d)(f)$.

Using now definitions (2.7), (2.8) and notations (2.9)-(2.13), the following
holds.

Proposition 2.1. Under (A1), (A2), (A3), $(X_i)_{i \geq 0}$ is a time
homogeneous Markov chain on $\mathbb{N}^5$ with initial distribution $\pi_0(x)$ and transition
probabilities $p(x, x')$ satisfying

(2.14)  $\pi_0(x) = \nu(s,t)\, p_3(r/s,t)\, p_4(v/r)\, p_5(f/v)$

(2.15)  $p(x, x') = p_1(s'/s,t,r)\, p_2(t'/f)\, p_3(r'/s',t')\, p_4(v'/r')\, p_5(f'/v')$.

The process $(X_i)_{i \geq 0}$ is also a multitype branching process with immigration
in one type.
The last statement of Proposition 2.1 is immediate since, considering each
stage of the plant as a type, each plant reproduces independently from the
others according to the same offspring distribution with values in $\mathbb{N}^5$.
However, the two types $R_i$ and $V_i$ have no offspring in the next generation,
leading to a non positively regular multitype process as defined in Mode
(1971) or Athreya and Ney (1972). The process $(S_i, T_i, F_i)$ is a positively
regular process but, for the reasons stated in 2.1, we prefer keeping here the
five-dimensional process $(X_i)$.

The proof of Proposition 2.1 is given in Appendix A.1.

3. Likelihood and inference for complete observations. We first 
study the case when all types are observed. 

3.1. Notations and statistical framework. We assume in the sequel that
the initial distribution of $(S_0, T_0)$, the offspring distribution $G(.)$ and the
immigration distribution $\mu(.)$ belong to the parametric families:

- distribution of $(S_0, T_0)$: $(\nu(\theta^1; s, t),\ \theta^1 \in \Theta^1)$;

- offspring $G(.)$: $(G(\theta^2; .),\ \theta^2 \in \Theta^2)$ with mean $m$ and variance $\delta^2$;

- immigration $\mu(.)$: $(\mu(\theta^3; .),\ \theta^3 \in \Theta^3)$ with mean $u$ and variance $\rho^2$.

Let us denote by $\theta = (c, d, a, b, a', b', \theta^1, \theta^2, \theta^3)$ (resp. $\theta_0$) an arbitrary
value (resp. the true value) of the parameter and by $\Theta$ the parameter set.
We assume:

Assumption 4. $\Theta$ is a compact set of $\mathbb{R}^l$ and $\theta_0 \in \Theta$.

For the $k$th population, $X^k = (X^k_i, i \in \mathbb{N})$ is the Markov chain describing
the population dynamics and $x^k_i = (s^k_i, t^k_i, r^k_i, v^k_i, f^k_i)$ are the observations at
time $i$. In order to simplify the expressions for the likelihood, we consider
here that $S_i, T_i$ are observed up to time $n+1$. This has no consequence since
seeds are not observed in practice. Hence we denote

(3.2)  $O^k_{0:n} = (x^k_0, x^k_1, \dots, x^k_n, s^k_{n+1}, t^k_{n+1})$.

The processes $X^k_{0:n}$ are repetitions of $X_{0:n} = (X_0, \dots, X_n, S_{n+1}, T_{n+1})$.
Joining all the populations, define:

(3.3)  $X_{0:n}(K) = (X^1_{0:n}, \dots, X^K_{0:n})$, and $O_{0:n}(K) = (O^1_{0:n}, \dots, O^K_{0:n})$.

Let $\pi_0(\theta; x)$ (resp. $p(\theta; x, x')$) be the initial distribution (resp. the transition
probabilities) associated with parameter $\theta$, $P_\theta$ the probability distribution
of $(X_i)$ on the canonical space and $E_\theta$ the expectation w.r.t. $P_\theta$.
3.2. Likelihood. Computing the likelihood of a Markov process with
transition probabilities $p(\theta; x, x')$ is classical: for population $k$, it reads,
using (2.14),

(3.4)  $L(\theta; O^k_{0:n}) = \pi_0(\theta; x^k_0) \Big( \prod_{i=1}^{n} p(\theta; x^k_{i-1}, x^k_i) \Big)\, p_1(s^k_{n+1}/s^k_n, t^k_n, r^k_n)\, p_2(t^k_{n+1}/f^k_n)$.

Joining the observations from the $K$ independent populations, we obtain,
using Proposition 2.1 and (2.10)-(2.13),

(3.5)  $l(\theta; O_{0:n}(K)) = l_K(\theta) = \sum_{k=1}^{K} \log L(\theta; O^k_{0:n})$.

Reordering the terms of (3.4) and (3.5) according to the parameters yields

(3.6)  $l(\theta; O_{0:n}(K)) = \sum_{j=0}^{5} l^j_K(\theta)$.

The first term deals with the initial distribution of $(S_0, T_0)$:

(3.7)  $l^0_K(\theta) = \sum_{k=1}^{K} \log \nu(\theta^1; s^k_0, t^k_0) = l^0_K(\theta^1)$.

Gather in the second term the transition from seeds S, T to rosettes R:

(3.8)  $l^1_K(\theta) = \sum_{k=1}^{K} \sum_{i=0}^{n} \log (\mathcal{B}(s^k_i; b) * \mathcal{B}(t^k_i; b'))(r^k_i) = l^1_K(b, b')$.

The next two terms contain the transitions from R to V and V to F; they
read

(3.9)  $l^2_K(\theta) = \sum_{k=1}^{K} \sum_{i=0}^{n} \big[ v^k_i \log c + (r^k_i - v^k_i) \log(1 - c) \big] + C_2(O_{0:n}(K)) = l^2_K(c)$,

(3.10)  $l^3_K(\theta) = \sum_{k=1}^{K} \sum_{i=0}^{n} \big[ f^k_i \log d + (v^k_i - f^k_i) \log(1 - d) \big] + C_3(O_{0:n}(K)) = l^3_K(d)$,

where the two terms $C_2(O_{0:n}(K))$ and $C_3(O_{0:n}(K))$ only depend on the
observations. Set in the next term the transition from F to T:

(3.11)  $l^4_K(\theta) = \sum_{k=1}^{K} \sum_{i=0}^{n} \log \big( G(\theta^2; .)^{*f^k_i} * \mu(\theta^3; .) \big)(t^k_{i+1}) = l^4_K(\theta^2, \theta^3)$.
k=l i=0 
The last term concerns the seeds S:

(3.12)  $l^5_K(\theta) = \sum_{k=1}^{K} \sum_{i=0}^{n} \log \dfrac{(\mathcal{M}(s^k_i; a, b) * \mathcal{M}(t^k_i; a', b'))(s^k_{i+1}, r^k_i)}{(\mathcal{B}(s^k_i; b) * \mathcal{B}(t^k_i; b'))(r^k_i)}$.

Joining the two terms containing $a, a', b, b'$ yields

(3.13)  $l^6_K(\theta) = l^1_K(\theta) + l^5_K(\theta) = l^6_K(a, b, a', b')$.

The terms $l^0_K(\theta), l^2_K(\theta), l^3_K(\theta), l^4_K(\theta)$ and $l^6_K(\theta)$ depend on disjoint sets of
parameters. Hence, the loglikelihood can be maximized by maximizing these
five terms separately.

Remark 3.1. Usually, statistical inference for stochastic processes is
investigated in the asymptotic framework of small $K$ (mostly $K = 1$) and large
$n$ (leading to asymptotic results as $n \to +\infty$). Here, $n$ is small
and $K$ is large (of the order of 300). This situation often occurs in Ecology
(de Valpine, 2004).

3.3. Study of maximum likelihood and other estimators. This is a $K$-sample
of i.i.d. random variables, each variable being a part of a branching
process path. We have to use simultaneously the repetitions and the Markov
structure to estimate the parameters. So, deriving the properties of the
statistical model is not standard. The various terms $l^j_K(\theta)$ of the loglikelihood
are associated with different parametric inference problems.

The first term $l^0_K(\theta)$ deals with the estimation of $\theta^1$ based on a sample of
$K$ i.i.d. random variables with distribution $\nu(\theta^1; .)$ on $\mathbb{N}^2$, which is standard.
The terms $l^2_K(\theta) = l^2_K(c)$ and $l^3_K(\theta) = l^3_K(d)$ are related to the estimation
of parameters $c, d$. Denote by $\hat{c}_K, \hat{d}_K$ the maximum likelihood estimators
(MLE) obtained by maximizing $l^2_K(c)$ and $l^3_K(d)$. They depend on the successive
observations $(r^k_i, v^k_i)$ (resp. $(v^k_i, f^k_i)$) for $\{i = 0, \dots, n;\ k = 1, \dots, K\}$ and are
explicit (see Appendix A.2).
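As an illustration of why these estimators are explicit: setting the derivative of the binomial log-likelihood (3.9) to zero gives the pooled proportion $\sum v^k_i / \sum r^k_i$ for $c$, and likewise $\sum f^k_i / \sum v^k_i$ for $d$. The sketch below computes this proportion; the function names and the counts are hypothetical, and the exact expressions used by the authors are those of Appendix A.2, which is not reproduced here.

```python
import math

def binomial_mle(successes, trials):
    """MLE of a common success probability from pooled binomial counts."""
    total = sum(trials)
    if total == 0:
        raise ValueError("no trials observed")
    return sum(successes) / total

def binomial_loglik(p, successes, trials):
    """Log-likelihood in p, up to the additive constant that depends only on the data."""
    return sum(x * math.log(p) + (n - x) * math.log(1.0 - p)
               for x, n in zip(successes, trials))

# Hypothetical pooled counts (r_i^k, v_i^k) over all years i and populations k.
rosettes_before = [12, 7, 20, 9]
rosettes_after = [3, 1, 5, 1]
c_hat = binomial_mle(rosettes_after, rosettes_before)
print(c_hat)
```

The same function applied to the pairs $(v^k_i, f^k_i)$ gives the corresponding estimate of $d$.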

Parameters $(a, b, a', b')$ are only present in $l^6_K(\theta)$ defined in (3.13).
Maximum likelihood estimators for $(a, b, a', b')$ can be defined by maximizing
$l^6_K(\theta)$. To prove identifiability and consistency, we rather consider here
conditional least squares (CLS) estimators. Conditionally on $(S_i, T_i)$, the
marginal distribution of $S_{i+1}$ (resp. $R_i$) is the convolution of the two
distributions $\mathcal{B}(S_i, a)$ and $\mathcal{B}(T_i, a')$ (resp. $\mathcal{B}(S_i, b)$ and $\mathcal{B}(T_i, b')$). Therefore, we
can define the CLS estimators $(\hat{a}_K, \hat{a}'_K)$ and $(\hat{b}_K, \hat{b}'_K)$ minimizing the
Conditional Least Squares:

(3.14)  $J^1_K(a, a') = \sum_{k=1}^{K} \sum_{i=0}^{n} (s^k_{i+1} - a\, s^k_i - a'\, t^k_i)^2$,

(3.15)  $J^2_K(b, b') = \sum_{k=1}^{K} \sum_{i=0}^{n} (r^k_i - b\, s^k_i - b'\, t^k_i)^2$.
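A conditional least squares criterion of this kind, a sum of squared deviations of a count from a linear combination of the seed counts $S_i$ and $T_i$ without intercept, can be minimized by solving a 2x2 system of normal equations. The sketch below does this for pooled data; the function name and the noise-free toy data are assumptions made for illustration only.

```python
def cls_two_sources(y, s, t):
    """Minimise sum (y - alpha*s - beta*t)^2 over (alpha, beta) via the normal equations."""
    ss = sum(x * x for x in s)
    tt = sum(x * x for x in t)
    st = sum(x * z for x, z in zip(s, t))
    sy = sum(x * z for x, z in zip(s, y))
    ty = sum(x * z for x, z in zip(t, y))
    det = ss * tt - st * st
    if det == 0:
        raise ValueError("s and t are collinear: the CLS problem is degenerate")
    # Cramer's rule on [[ss, st], [st, tt]] (alpha, beta)^T = (sy, ty)^T.
    return (sy * tt - ty * st) / det, (ty * ss - sy * st) / det

# Hypothetical pooled counts; y is set to its conditional mean, so the fit is exact.
s = [10, 20, 30, 5]
t = [4, 1, 7, 9]
y = [0.15 * si + 0.006 * ti for si, ti in zip(s, t)]
a_hat, a1_hat = cls_two_sources(y, s, t)
print(a_hat, a1_hat)
```

With $y$ the next-year seed bank counts the function returns estimates of $(a, a')$; with $y$ the rosette counts it returns estimates of $(b, b')$.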

The remaining term is $l^4_K(\theta^2, \theta^3) = l^4_K(m, u)$ since we are concerned with
the estimation of $m$ and $u$. This is the only part of the likelihood associated
with the branching mechanism. This likelihood, containing the convolution
product $G^{*f} * \mu$, is intractable, and methods based on conditional least
squares or weighted conditional least squares are used for branching
processes (Hall and Heyde, 1980; Guttorp, 1991; Wei and Winnicki, 1990), leading
to moment estimations of $G$ and $\mu$. Noting that $E(T_{i+1}/F_i) = mF_i + u$,
we can just consider, for the estimation of $m$ and $u$, the CLS process

(3.16)  $J^3_K(m, u) = \sum_{k=1}^{K} \sum_{i=0}^{n} (t^k_{i+1} - m f^k_i - u)^2$.

Let $(\hat{m}_K, \hat{u}_K)$ be the CLS estimators minimizing (3.16). All the above
estimators are explicitly defined in Appendix A.2 and we can state:

Proposition 3.1. Assume (A1), (A2), (A3) and (A4). Then, under
$P_{\theta_0}$, all the parameters $c, d, a, b, a', b', m, u$ are identifiable and, as $K \to \infty$,

(i) $(\hat{c}_K, \hat{d}_K, \hat{a}_K, \hat{b}_K, \hat{a}'_K, \hat{b}'_K, \hat{m}_K, \hat{u}_K)$ are consistent and asymptotically
Gaussian at rate $\sqrt{K}$;

(ii) $\hat{c}_K$, $\hat{d}_K$, $(\hat{a}_K, \hat{b}_K, \hat{a}'_K, \hat{b}'_K)$, $(\hat{m}_K, \hat{u}_K)$ are asymptotically independent,
with explicit covariance matrix given in Appendix A.2.
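Since $E(T_{i+1}/F_i) = mF_i + u$, minimizing a least squares criterion in $(m, u)$ is an ordinary regression of next-year new seeds on mature plants with an intercept. A minimal sketch follows; the function name and the noise-free toy data are hypothetical, and this is a sketch of the minimization rather than a reproduction of Appendix A.2.

```python
def cls_offspring_immigration(f, t_next):
    """Minimise sum (t_next - m*f - u)^2 over (m, u): slope and intercept of OLS."""
    n = len(f)
    f_bar = sum(f) / n
    t_bar = sum(t_next) / n
    sxx = sum((x - f_bar) ** 2 for x in f)
    if sxx == 0:
        raise ValueError("all f values are equal: m and u cannot be separated")
    sxy = sum((x - f_bar) * (y - t_bar) for x, y in zip(f, t_next))
    m_hat = sxy / sxx
    return m_hat, t_bar - m_hat * f_bar

# Hypothetical pooled pairs (f_i^k, t_{i+1}^k), set to the conditional mean 13*f + 80.
f = [0, 1, 2, 5, 3]
t_next = [13 * x + 80 for x in f]
m_hat, u_hat = cls_offspring_immigration(f, t_next)
print(m_hat, u_hat)
```

The error raised when all $f$ values coincide reflects the point made in the text: with immigration present, $m$ and $u$ can only be separated if the number of mature plants varies across observations.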

Let us stress that, before studying in detail this inference problem, it
was difficult to assert the classical properties stated in Proposition 3.1.
Adding immigration could lead to non-identifiability or estimation problems
for $m$ and $u$. Moreover, maximum likelihood estimators for multitype
branching processes are based on the observations of $G_{ij}(k)$, i.e. the offspring
of type $j$ from type $i$ parents (see Guttorp, 1991; Gonzalez et al., 2008;
Maaouia and Touati, 2005). We did not require this information for the
inference and just used the counts of individuals of each type in successive
generations. Hence, getting identifiability and consistency for the
parameters is the only difficulty here. The other properties are classical but
require using the exact structure of the data, provided the statistical model
is regular.
Remark 3.2. Estimating additional moments of $G(.)$ and $\mu(.)$ can be
performed similarly using other functionals than Conditional Least Squares
(see Winnicki, 1991 for the variance estimation of $G(.)$ and $\mu(.)$).

The proof of Proposition 3.1 is given in Appendix A.2.

4. Incomplete model study in the Poisson case. The set-up is now
different: only $R_i, V_i, F_i$ are observed while $S_i$ and $T_i$ are unobserved. Clearly,
algorithms simulating the missing data given the model and the parameters
at each step can be used to obtain estimates. This approach is complementary
to our concern, which is to understand which mechanisms can be
estimated. For this, we have to study the process $(R_i, V_i, F_i)$ for $i = 0, \dots, n$.
It is a discrete time stochastic process, which is no longer Markov: the
distribution of $(R_{i+1}, V_{i+1}, F_{i+1})$ given the past now depends on the whole
past and not only on $(R_i, V_i, F_i)$. This appears explicitly later on. This
is similar to problems encountered when studying Hidden Markov Models
(Genon-Catalot et al., 2003; Genon-Catalot and Laredo, 2006; Cappe et al.,
2005). As a first approach, we restrict our attention to a very informative
example, the case of Poisson distributions, which leads to explicit
computations.

4.1. Probabilistic properties in the Poisson case. Let us specify all the 
distributions appearing in the populations dynamics. 

Assumption 5. The offspring distribution $G(.)$ is Poisson $\mathcal{P}(m)$, the
immigration distribution $\mu(.)$ is Poisson $\mathcal{P}(u)$.

Assumption 6. The variables $S_0$ and $T_0$ are independent and distributed
according to Poisson distributions: $S_0 \sim \mathcal{P}(\sigma)$ and $T_0 \sim \mathcal{P}(\tau)$.

For Poisson distributions, we denote $\mathcal{P}(\lambda)(k) = \mathbb{P}(X = k)$. Recall a
property of Multinomial and Poisson distributions.

Lemma 4.1. Assume that $N$ is a random variable distributed according
to a Poisson distribution $\mathcal{P}(\lambda)$ and that $X = (X_1, X_2, \dots, X_l)$ is an
$l$-dimensional random variable such that the conditional distribution of $X$
given $N$ is a Multinomial distribution $\mathcal{M}(N; \alpha_1, \dots, \alpha_l)$ with $\sum_{i=1}^{l} \alpha_i = 1$.
Then, the random variables $\{X_i, i = 1, \dots, l\}$ are independent and verify
$X_i \sim \mathcal{P}(\alpha_i \lambda)$.
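The lemma can be checked numerically on a single point: since the multinomial probabilities sum to one, the total $N$ is determined by the components, so the joint probability of $(x_1, \dots, x_l)$ is the Poisson probability of $N = \sum x_i$ times the multinomial probability, and this should equal the product of the Poisson probabilities with means $\alpha_i \lambda$. The sketch below performs this check; the function names and numerical values are illustrative assumptions.

```python
import math

def poisson_pmf(lam, k):
    """P(lam)(k) for a Poisson law with mean lam."""
    return math.exp(-lam) * lam ** k / math.factorial(k)

def multinomial_pmf(counts, probs):
    """M(n; probs)(counts) with n = sum(counts)."""
    coef = math.factorial(sum(counts))
    for x in counts:
        coef //= math.factorial(x)  # exact: every partial quotient is an integer
    value = float(coef)
    for x, p in zip(counts, probs):
        value *= p ** x
    return value

def joint_pmf(lam, counts, probs):
    """P(N = sum(counts)) * P(X = counts / N): the left-hand side of the lemma."""
    return poisson_pmf(lam, sum(counts)) * multinomial_pmf(counts, probs)

def product_pmf(lam, counts, probs):
    """Product of independent Poisson(alpha_i * lam) probabilities."""
    out = 1.0
    for x, p in zip(counts, probs):
        out *= poisson_pmf(p * lam, x)
    return out

# Example: a Poisson number of seeds split into (seed bank, rosette, death).
lam, probs, counts = 4.0, (0.15, 0.5, 0.35), (2, 1, 3)
print(joint_pmf(lam, counts, probs), product_pmf(lam, counts, probs))
```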

First consider one population and omit the index $k$ in what follows. Set

(4.1)  $Y_i = (R_i, V_i, F_i)$ and $\mathcal{G}_i = \sigma(Y_j,\ j = 0, \dots, i)$.

Clearly, $\mathcal{G}_i$ is the information available up to time $i$. To state the main
result of this section, define the three sequences of $\mathcal{G}_{i-1}$-measurable random
variables:

(4.2)  $\Gamma_0 = \sigma$ ;  $\Gamma'_0 = \tau$ ;

(4.3)  for $i \geq 1$, $\Gamma_i = a\Gamma_{i-1} + a'\Gamma'_{i-1}$ ;  $\Gamma'_i = mF_{i-1} + u$.

(4.4)  for $i \geq 0$, $\Lambda_i = b\Gamma_i + b'\Gamma'_i$.

Then, the following holds:

Theorem 4.1. Under Assumptions (A1)-(A6), the initial distribution
$\tilde{\pi}_0(y)$ of $(Y_i)$ and the conditional distribution $\mathcal{L}(Y_{i+1}/\mathcal{G}_i)$ satisfy, using (4.1)
and definitions (2.13), (4.4), for $y = (r, v, f) \in \mathbb{N}^3$,

(4.5)  $\mathbb{P}(Y_0 = (r_0, v_0, f_0)) = \tilde{\pi}_0(y_0) = \mathcal{P}(\Lambda_0)(r_0)\, p_4(v_0/r_0)\, p_5(f_0/v_0)$,

(4.6)  $\mathbb{P}(Y_{i+1} = (r, v, f)/\mathcal{G}_i) = \mathcal{P}(\Lambda_{i+1})(r)\, p_4(v/r)\, p_5(f/v)$.

The explicit dependence of $R_i$ on the whole past $(Y_0, \dots, Y_{i-1})$ appears
more simply with the following expression for $\Lambda_i$:

(4.7)  $\Lambda_0 = b\sigma + b'\tau = c_0(\theta)$ ;

(4.8)  $\Lambda_1 = b'mF_0 + c_1(\theta)$ with $c_1(\theta) = ab\sigma + a'b\tau + b'u$,

(4.9)  $\Lambda_i = b'mF_{i-1} + a'bm\,(F_{i-2} + aF_{i-3} + \dots + a^{i-2}F_0) + c_i(\theta)$, with

(4.10)  $c_i(\theta) = a^i b\sigma + a^{i-1} a'b\tau + a'bu\, \dfrac{1 - a^{i-1}}{1 - a} + b'u$ for $i \geq 2$.
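The closed form for the Poisson means can be checked against the recursion that defines them, i.e. initial values $\sigma$ and $\tau$, the updates $\Gamma_i = a\Gamma_{i-1} + a'\Gamma'_{i-1}$ and $\Gamma'_i = mF_{i-1} + u$, and $\Lambda_i = b\Gamma_i + b'\Gamma'_i$. The sketch below compares the two on an arbitrary sequence of mature plant counts. The dictionary keys (`a1`, `b1` for $a'$, $b'$), the function names and the numerical values are assumptions for illustration.

```python
import math

KEYS = ("sigma", "tau", "a", "b", "a1", "b1", "m", "u")

def lambda_recursive(theta, F):
    """Lambda_0..Lambda_n from the recursion, with F = [F_0, ..., F_{n-1}]."""
    sigma, tau, a, b, a1, b1, m, u = (theta[k] for k in KEYS)
    gamma, gamma1 = sigma, tau
    lambdas = [b * gamma + b1 * gamma1]
    for f_prev in F:
        # Both updates use the previous pair (gamma, gamma1).
        gamma, gamma1 = a * gamma + a1 * gamma1, m * f_prev + u
        lambdas.append(b * gamma + b1 * gamma1)
    return lambdas

def lambda_closed(theta, F, i):
    """Lambda_i from the closed-form expression."""
    sigma, tau, a, b, a1, b1, m, u = (theta[k] for k in KEYS)
    if i == 0:
        return b * sigma + b1 * tau
    # Constant term c_i; for i = 1 the geometric factor vanishes.
    c_i = (a ** i * b * sigma + a ** (i - 1) * a1 * b * tau
           + a1 * b * u * (1 - a ** (i - 1)) / (1 - a) + b1 * u)
    # F_{i-2} + a F_{i-3} + ... + a^{i-2} F_0 (empty sum when i = 1).
    past = sum(a ** j * F[i - 2 - j] for j in range(i - 1))
    return b1 * m * F[i - 1] + a1 * b * m * past + c_i

theta = {"sigma": 40.0, "tau": 60.0, "a": 0.15, "b": 0.5,
         "a1": 0.006, "b1": 0.5, "m": 13.0, "u": 80.0}
F = [3, 0, 5, 2]
for i, lam in enumerate(lambda_recursive(theta, F)):
    print(i, lam, lambda_closed(theta, F, i))
```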

The result of Theorem 4.1 is a consequence of the proposition stated 
below. 

Proposition 4.1. Under Assumptions (A1)-(A6), the random variables
$S_{i+1}, T_{i+1}$ are conditionally independent given $\mathcal{G}_i$, and their conditional
distributions satisfy, using the random variables $\Gamma_i$ and $\Gamma'_i$ defined in (4.2),
(4.3),

(4.11)  $\mathcal{L}(S_{i+1}/\mathcal{G}_i) \sim \mathcal{P}(\Gamma_{i+1})$ and $\mathcal{L}(T_{i+1}/\mathcal{G}_i) \sim \mathcal{P}(\Gamma'_{i+1})$.

Let us prove Theorem 4.1 assuming Proposition 4.1. We just have to
check the expression of the conditional distribution of $R_{i+1}$. By (2.5), we
have $R_{i+1} = R'_{i+1} + R''_{i+1}$. Using Proposition 2.1 and (2.12), the distribution
of $R_{i+1}$ conditionally on $(S_{i+1}, T_{i+1})$ is equal to $p_3(r/S_{i+1}, T_{i+1})$. By
Proposition 4.1, $S_{i+1}$ and $T_{i+1}$ are conditionally independent given $\mathcal{G}_i$ and
distributed according to two Poisson distributions. Hence, another
application of Lemma 4.1 yields that the conditional distribution of
$R_{i+1}$ given $\mathcal{G}_i$ is $\mathcal{P}(b\Gamma_{i+1} + b'\Gamma'_{i+1}) = \mathcal{P}(\Lambda_{i+1})$, with $\Lambda_{i+1}$ as in (4.4).
The proof of Proposition 4.1 is given in Appendix A.3.

4.2. Likelihood of the incomplete observations. The inference is now based
on the observations of $(Y_i)$ recorded up to time $n$ for $K$ independent
populations. Denote by $Y^k = (R^k_i, V^k_i, F^k_i)$ the process describing the dynamics
in population $k$ and by $y^k_i = (r^k_i, v^k_i, f^k_i)$ the observations at time $i$. We set

(4.12)  $Y^k_{0:n} = (Y^k_0, \dots, Y^k_n)$, and $Y_{0:n}(K) = (Y^1_{0:n}, \dots, Y^K_{0:n})$.

Observations up to time $n$ are denoted

(4.13)  $\tilde{O}^k_{0:n} = (y^k_0, y^k_1, \dots, y^k_n)$ and $\tilde{O}_{0:n}(K) = (\tilde{O}^1_{0:n}, \dots, \tilde{O}^K_{0:n})$.

Let us first compute the likelihood for one population, population $k$.
Successive conditionings yield

(4.14)  $L(\theta; \tilde{O}^k_{0:n}) = P_\theta(Y^k_0 = y^k_0) \prod_{i=1}^{n} P_\theta(Y^k_i = y^k_i / Y^k_{0:i-1} = y^k_{0:i-1})$.

Contrary to the previous section, each term of this product depends on $i$
and on the observations up to time $i - 1$. Theorem 4.1 gives the expression
of these conditional probabilities.

Since the random variables $\Lambda_i$ now depend on $\theta$ and on the past, we define
$\Lambda_i(\theta) = \Lambda_i(\theta; Y_{0:i-1})$, and for population $k$,

(4.15)  $\Lambda^k_i(\theta) = \Lambda_i(\theta; Y^k_{0:i-1})$ ,  $\lambda^k_i(\theta) = \Lambda_i(\theta; y^k_{0:i-1})$.

(4.16)  $P_\theta(Y^k_i = y^k_i / y^k_{0:i-1}) = \mathcal{P}(\lambda^k_i(\theta))(r^k_i)\, p_4(c; v^k_i/r^k_i)\, p_5(d; f^k_i/v^k_i)$.

Joining the observations in the $K$ populations, the likelihood reads, using
notations (4.12), (4.13), (4.16),

(4.17)  $L(\theta; \tilde{O}_{0:n}(K)) = \prod_{k=1}^{K} L(\theta; \tilde{O}^k_{0:n})$.

The log-likelihood splits into three terms,

(4.18)  $l(\theta, \tilde{O}_{0:n}(K)) = \log L(\theta; \tilde{O}_{0:n}(K)) = \sum_{j=1}^{3} l^j_K(\theta, \tilde{O}_{0:n}(K))$,

where, using Assumptions (A5), (A6) and Theorem 4.1,

(4.19)  $l^1_K(\theta) = l^1_K(\theta, \tilde{O}_{0:n}(K)) = \sum_{k=1}^{K} \sum_{i=0}^{n} \log \mathcal{P}(\lambda^k_i(\theta))(r^k_i)$,

(4.20)  $l^2_K(\theta, \tilde{O}_{0:n}(K)) = l^2_K(c, O_{0:n})$ ;  $l^3_K(\theta, \tilde{O}_{0:n}(K)) = l^3_K(d, O_{0:n})$.

Hence, estimating parameters $c, d$ is exactly the same as in the previous
section; their inference is omitted in the sequel.
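To make the rosette part of this log-likelihood concrete, the sketch below evaluates a sum of Poisson log-probabilities of the rosette counts, where each Poisson mean is recomputed from the past mature plant counts of the same population through the recursion of Section 4.1. The data layout (a list of `(R, F)` pairs per population), the dictionary keys and the function names are assumptions made for illustration; no optimizer is included.

```python
import math

KEYS = ("sigma", "tau", "a", "b", "a1", "b1", "m", "u")

def poisson_means(theta, F_past):
    """Means Lambda_0..Lambda_n given F_past = [f_0, ..., f_{n-1}]."""
    sigma, tau, a, b, a1, b1, m, u = (theta[k] for k in KEYS)
    gamma, gamma1 = sigma, tau
    means = [b * gamma + b1 * gamma1]
    for f_prev in F_past:
        gamma, gamma1 = a * gamma + a1 * gamma1, m * f_prev + u
        means.append(b * gamma + b1 * gamma1)
    return means

def rosette_loglik(theta, populations):
    """Sum over populations k and years i of log Poisson(lambda_i^k)(r_i^k)."""
    total = 0.0
    for R, F in populations:  # R = [r_0..r_n], F = [f_0..f_n]
        for r, lam in zip(R, poisson_means(theta, F[:-1])):
            total += -lam + r * math.log(lam) - math.lgamma(r + 1)
    return total

theta = {"sigma": 40.0, "tau": 60.0, "a": 0.15, "b": 0.5,
         "a1": 0.006, "b1": 0.5, "m": 13.0, "u": 80.0}
# Two hypothetical populations observed for n = 2.
data = [([48, 55, 60], [1, 2, 0]), ([52, 41, 47], [0, 0, 1])]
print(rosette_loglik(theta, data))
```

Only $F_0, \dots, F_{n-1}$ enter the means, which is why the last mature plant count of each population is dropped before calling `poisson_means`.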

4.3. Parametric inference. It remains to study the estimation of the
parameter $\theta = (\sigma, \tau, a, b, a', b', u, m)$. As before, let $\theta_0$ be the true value of the
parameter.

We first have to investigate which parameters are identifiable when only
these incomplete observations are available. By identifiability, we mean here
identifiability of a statistical model $\mathcal{M} = (P_\theta, \theta \in \Theta)$:

$\forall \theta, \theta' \in \Theta$,  $\{P_\theta = P_{\theta'}\} \Rightarrow \{\theta = \theta'\}$.

Recall that $n$ denotes the time index of a plant life cycle and that we
consider populations recorded up to time $n$ ($n = 0$ corresponding here to
the observation of one complete cycle). Using now the definitions of the
terms $c_i(\theta)$ given in (4.7)-(4.10), the following holds:

Theorem 4.2. Assume (A1), (A2), (A3), (A4), (A5), (A6). Then,

1. if $n = 0$, only $c_0(\theta) = b\sigma + b'\tau$ is identifiable;

2. if $n = 1$, $(b'm, c_0(\theta), c_1(\theta))$ is identifiable;

3. if $n = 2$, $(a'b/b', b'm, c_0(\theta), c_1(\theta), c_2(\theta))$ is identifiable;

4. if $n \geq 3$ and $a \neq a'b/b'$, then $\phi = (a, a'b/b', b'm, b'u, b\sigma, b'\tau)$ is identifiable;
if $n \geq 3$ and $a = a'b/b'$, then $(a = a'b/b', b'm, b'u, b\sigma + b'\tau)$ is identifiable.

Note that identifiability of additional parameters cannot be gained by
increasing $n$ beyond 3. Larger values of $n$ result in improving the asymptotic
variance for the estimation of $\phi$.
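The loss of identifiability can be seen directly on the Poisson means of Section 4.1: by (4.7)-(4.10), they depend on the parameters only through $a$, $b'm$, $b'u$, $b\sigma$, $b'\tau$ and the ratio $a'b/b'$. The sketch below builds two different parameter vectors that share these combinations and checks that they give the same means for any sequence of mature plant counts, hence the same distribution for the observed rosettes. The dictionary keys, function names and numerical values are assumptions for illustration.

```python
import math

KEYS = ("sigma", "tau", "a", "b", "a1", "b1", "m", "u")

def poisson_means(theta, F_past):
    """Means Lambda_0..Lambda_n from the recursion, with F_past = [F_0..F_{n-1}]."""
    sigma, tau, a, b, a1, b1, m, u = (theta[k] for k in KEYS)
    gamma, gamma1 = sigma, tau
    means = [b * gamma + b1 * gamma1]
    for f_prev in F_past:
        gamma, gamma1 = a * gamma + a1 * gamma1, m * f_prev + u
        means.append(b * gamma + b1 * gamma1)
    return means

def combinations(theta):
    """The parameter combinations that enter the Poisson means."""
    sigma, tau, a, b, a1, b1, m, u = (theta[k] for k in KEYS)
    return (a, a1 * b / b1, b1 * m, b1 * u, b * sigma, b1 * tau)

theta_1 = {"sigma": 40.0, "tau": 60.0, "a": 0.15, "b": 0.5,
           "a1": 0.006, "b1": 0.5, "m": 13.0, "u": 80.0}
# Halve a' and b', double m, u and tau: every combination above is unchanged.
theta_2 = {"sigma": 40.0, "tau": 120.0, "a": 0.15, "b": 0.5,
           "a1": 0.003, "b1": 0.25, "m": 26.0, "u": 160.0}

F = [3, 0, 5, 2, 1]
print(combinations(theta_1), combinations(theta_2))
print(poisson_means(theta_1, F))
print(poisson_means(theta_2, F))
```

Because the two vectors cannot be told apart from rosette and mature plant counts, quantities such as $m$ or $b'$ alone cannot be estimated without observing seeds or adding outside information.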

Remark 4.1. Stating the above theorem for the first values of $n$ is
unusual. However, in the field survey of feral oilseed rape populations,
observations unfortunately had been collected only up to $n = 2$, which makes it
impossible to estimate $a$, the annual survival rate in the seed bank, a
parameter of much concern in Ecology.
Assume now that $n \geq 3$, $a \neq a'b/b'$ and let us study the inference of the
identifiable parameters. Let us denote by $\phi = (a, a'b/b', b'm, b'u, b\sigma, b'\tau)$ (resp.
$\phi_0$) an arbitrary value (resp. the true value) of the parameter, by $\Phi$ the parameter
set and assume

Assumption 7. $\Phi$ is a compact set of $\mathbb{R}_+ \times (0, 1)$; and $\phi_0 \in \Phi$.

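Using (4.19), define the maximum likelihood estimator $\hat{\phi}_K$ as a solution of

(4.21)  $l^1_K(\hat{\phi}_K) = \sup\{ l^1_K(\phi),\ \phi \in \Phi \}$.

Under Assumption (A7), the function $\phi \to \sum_{i=0}^{n} \big( \Lambda_i(\phi; Y_{0:i-1}) - R_i \log \Lambda_i(\phi; Y_{0:i-1}) \big)$
is a.s. $P_\phi$ twice differentiable on $\Phi$ and we can define, for $\phi = (\phi_1, \dots, \phi_6)$,
the $6 \times 6$ matrix

$I(\phi)_{pq} = \sum_{i=0}^{n} E_{\phi_0}\Big( \dfrac{1}{\Lambda_i(\phi)} \dfrac{\partial \Lambda_i(\phi)}{\partial \phi_p} \dfrac{\partial \Lambda_i(\phi)}{\partial \phi_q} \Big)$ with $1 \leq p, q \leq 6$.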

Proposition 4.2. Assume (A1)-(A3), (A5), (A6) and (A7). Then $\hat{\phi}_K$
is strongly consistent. If moreover the matrix $I(\phi_0)$ is invertible, then

$\sqrt{K}\,(\hat{\phi}_K - \phi_0) \xrightarrow{\mathcal{L}} \mathcal{N}(0, I(\phi_0)^{-1})$ under $P_{\phi_0}$ as $K \to +\infty$.

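Remark 4.2. The matrix $I(\phi_0)$ that appears in the asymptotic variance
of $\hat\phi_K$ can be estimated, using the explicit expressions for the derivatives of
$\Lambda_i(\phi)$, by the empirical estimate, for $1 \le p, q \le 6$,

$\hat I_{pq} = \frac{1}{K} \sum_{k=1}^{K} \sum_{i=0}^{n} \frac{1}{\Lambda_i^k(\hat\phi_K)} \frac{\partial \Lambda_i^k(\hat\phi_K)}{\partial \phi_p} \frac{\partial \Lambda_i^k(\hat\phi_K)}{\partial \phi_q}$.

The proofs of Theorem 4.2 and Proposition 4.2 are given in Appendices
A.4 and A.5.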

5. Simulation study. In the case of incomplete observations, the es- 
timators strongly rely on the assumption that both the offspring and the 
immigration distributions are Poisson distributions. We investigate in this 
section how the estimation behaves when these distributions are no longer 
Poisson.

5.1. Methods. We considered in these simulations that the offspring
distribution $G(.)$ or the immigration distribution $\mu(.)$ is more dispersed than
a Poisson distribution. Two series of simulations were performed: the first
one concerns deviations from a Poisson distribution for the offspring
distribution $G(.)$, and the other one for the immigration distribution $\mu(.)$.
For this, we used Negative Binomial distributions for $G(.)$ (resp. $\mu(.)$) with
mean $m$ (resp. $u$) and increasing values of the variance $\delta^2$ (resp. $\rho^2$);
the (variance/mean) ratio ranged from 2 to 1000. For each given set of
parameters, we performed $M = 100$ repetitions, each including $K = 300$
populations. These population dynamics were run during $n = 4$ years to get
the observations for rosettes ($R_0$ to $R_4$) and mature plants ($F_0$ to $F_3$),
using biologically plausible values for the demographic parameters. For
$a, b, a', b'$, we used the parameters given in Claessen et al., 2005a, b:

(5.1) a = 0.15 , a' = 0.006 , b = 0.5 , b' = 0.5. 

We also had to fix some values for $c$, $d$, $m = E(G)$, $u = E(\mu)$. We used values
estimated in Garnier et al., 2008:

(5.2) c = 0.21 , d = 0.01 , m = 13 , u = 80. 

The value $m = 13$ corresponds to the mean fecundity of plants mown twice
(Colbach et al., 2001). The value $u = 80$ corresponds to average immigration
when there is no cultivated field in the neighbourhood (Garnier et al., 2008).
Finally, for the initial distributions of $S_0$, $T_0$, we assumed

(5.3) $S_0 \sim \mathcal{P}(\sigma)$, $T_0 \sim \mathcal{P}(\tau)$ with $\sigma = \tau = 50$.

For each value of the (variance/mean) ratio, the identifiable parameters
(see Theorem 4.2, case $n \ge 3$) were estimated on each of the $M = 100$
repetitions of the $K = 300$ population trajectories. The mean and standard
deviation of these estimates were then computed empirically
from the M = 100 values obtained. Simulations and statistical analyses were 
performed using R-8-0 (R Development Core Team, 2005). 
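As an illustration of how an over-dispersed distribution with prescribed mean and (variance/mean) ratio can be drawn, here is a minimal sketch in Python with NumPy. It is not the R code used for the study; the helper name `sample_negative_binomial` and the sample size are assumptions made for this example. In NumPy's parameterisation the negative binomial has mean $n(1-p)/p$ and variance $n(1-p)/p^2$, so a ratio $r > 1$ corresponds to $p = 1/r$ and $n = \text{mean}/(r - 1)$.

```python
import numpy as np


def sample_negative_binomial(mean, ratio, size, rng):
    """Draw counts with the given mean and variance = ratio * mean (ratio > 1)."""
    p = 1.0 / ratio               # success probability in NumPy's parameterisation
    n = mean / (ratio - 1.0)      # possibly non-integer number of successes
    return rng.negative_binomial(n, p, size=size)


rng = np.random.default_rng(0)
# Offspring counts with mean m = 13 and variance/mean ratio 5.
offspring = sample_negative_binomial(mean=13.0, ratio=5.0, size=200_000, rng=rng)
empirical_mean = offspring.mean()
empirical_ratio = offspring.var() / empirical_mean
print(empirical_mean, empirical_ratio)
```

The empirical mean and ratio should be close to the requested values; the same helper could be used for immigration with mean $u = 80$.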

5.2. Results. Some technical difficulties occurred when we tried to
estimate jointly the six identifiable parameters $(a, \frac{a'b}{b'}, b'm, b'u, b\sigma, b'\tau)$:
the algorithm we used often did not converge because of the non-linearity
in the statistical model. Since we were mainly interested in the estimations
for $G(.)$ and $\mu(.)$, we assumed that the quantities $a$ and $\frac{a'b}{b'}$ were known.
With this simplification, we just had to deal with a linear statistical
model; we restricted our attention to the estimation of the parameters
$(b'm, b'u, b\sigma, b'\tau)$.
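To make the linearity explicit, the following sketch builds, for one population, the matrix whose rows express each intensity $\Lambda_i$ as a linear combination of $(b'm, b'u, b\sigma, b'\tau)$ once $a$ and $a'b/b'$ are fixed. It assumes the recursion of Appendix A.3 ($\Gamma_{i+1} = a\Gamma_i + a'\Gamma'_i$, $\Gamma'_{i+1} = mF_i + u$, $\Lambda_i = b\Gamma_i + b'\Gamma'_i$); the function name `design_matrix` and the counts `F` are hypothetical, and this is not the authors' estimation code.

```python
import numpy as np


def design_matrix(a, psi, F):
    """Rows give Lambda_i as a linear function of (b'm, b'u, b*sigma, b'*tau).

    a and psi = a'b/b' are treated as known; F = [F_0, ..., F_{n-1}].
    """
    x = np.array([0.0, 0.0, 1.0, 0.0])   # coefficients of b * Gamma_i
    y = np.array([0.0, 0.0, 0.0, 1.0])   # coefficients of b' * Gamma'_i
    rows = [x + y]
    for f in F:
        x = a * x + psi * y               # b*Gamma_{i+1} = a*(b*Gamma_i) + psi*(b'*Gamma'_i)
        y = np.array([float(f), 1.0, 0.0, 0.0])   # b'*Gamma'_{i+1} = (b'm)*F_i + b'u
        rows.append(x + y)
    return np.vstack(rows)


X = design_matrix(a=0.15, psi=0.006, F=[3, 7, 2, 5])
beta = np.array([6.5, 40.0, 25.0, 25.0])  # (b'm, b'u, b*sigma, b'*tau) from Section 5.1
lambdas = X @ beta                        # Poisson intensities Lambda_0, ..., Lambda_4
print(lambdas)
```

With such a matrix, the Poisson log-likelihood depends on the unknowns only through `X @ beta`, which is the linear structure referred to above.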

Value           $b'm = 6.5$       $b'u = 40$        $b\sigma = 25$    $b'\tau = 25$
$\delta^2/m$    est.     sd       est.     sd       est.     sd       est.     sd
2               6.44     0.77     40.02    0.21     25.19    3.28     24.88    3.27
5               6.46     0.61     40.05    0.24     24.46    3.34     25.52    3.36
10              6.65     0.84     39.96    0.23     25.18    2.93     24.84    2.95
50              6.78     1.40     39.97    0.27     25.10    3.73     24.88    3.75
100             6.56     1.89     39.99    0.25     25.24    3.63     24.73    3.64
500             6.26     3.30     40.01    0.32     24.45    6.66     25.50    6.69
1000            6.41     5.41     40.003   0.38     25.54    7.75     24.37    7.78

Table 1

Mean (est.) and standard deviation (sd) of estimators when the offspring distribution
$G(.)$ is a negative binomial distribution with mean $m$ and variance $\delta^2$ instead of a
$\mathcal{P}(m)$ distribution. Immigration $\mu \sim \mathcal{P}(u)$; $a = 0.16$, $a' = 0.006$, $b = b' = 0.5$,
$m = 13$, $u = 80$, $\sigma = \tau = 50$.



Value           $b'm = 6.5$       $b'u = 40$        $b\sigma = 25$    $b'\tau = 25$
$\rho^2/u$      est.     sd       est.     sd       est.     sd       est.     sd
2               6.51     0.77     40.01    0.24     24.61    3.81     25.36    3.76
5               6.45     0.98     40.01    0.37     24.33    5.19     25.67    5.28
10              6.61     1.47     40.03    0.52     24.78    7.25     25.17    7.25
50              6.59     2.63     39.93    0.87     25.60    14.89    24.46    14.93
100             7.42     3.43     39.95    1.38     28.60    22.42    21.39    22.40
500             7.61     7.26     40.05    3.13     21.96    39.31    28.06    39.31
1000            6.48     7.63     39.62    4.23     21.77    64.75    28.23    64.76

Table 2

Mean (est.) and standard deviation (sd) of estimators when the immigration is a
negative binomial distribution with mean $u$ and variance $\rho^2$ instead of a $\mathcal{P}(u)$
distribution. Offspring $G(.) \sim \mathcal{P}(m)$; $a = 0.16$, $a' = 0.006$, $b = b' = 0.5$, $m = 13$,
$u = 80$, $\sigma = \tau = 50$.

The results are given in Tables 1 and 2.

Concerning deviations of $G(.)$ from the distribution $\mathcal{P}(m)$, the
estimation procedure performed very well, even for large deviations
from the Poisson case: the bias remained less than 5% for all four
estimated quantities with a variance/mean ratio up to 1000 (see Table 1).
When immigration was assumed to follow a Negative Binomial distribution, the
estimation procedure performed quite well for values of the (variance/mean)
ratio up to 50: the biases remained less than 10% for all four identifiable
quantities. For larger values of the variance/mean ratio, the bias could
reach approximately 17% of the parameter value (see Table 2).
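The bias figures quoted above can be checked directly from the "est." columns of Tables 1 and 2. The short sketch below does this arithmetic; the helper name `max_relative_bias` is introduced here for illustration, and the numbers are copied from the two tables.

```python
true_values = (6.5, 40.0, 25.0, 25.0)   # (b'm, b'u, b*sigma, b'*tau)

# Mean estimates ("est." columns) copied from Tables 1 and 2,
# keyed by the variance/mean ratio.
table_1 = {
    2: (6.44, 40.02, 25.19, 24.88),
    5: (6.46, 40.05, 24.46, 25.52),
    10: (6.65, 39.96, 25.18, 24.84),
    50: (6.78, 39.97, 25.10, 24.88),
    100: (6.56, 39.99, 25.24, 24.73),
    500: (6.26, 40.01, 24.45, 25.50),
    1000: (6.41, 40.003, 25.54, 24.37),
}
table_2 = {
    2: (6.51, 40.01, 24.61, 25.36),
    5: (6.45, 40.01, 24.33, 25.67),
    10: (6.61, 40.03, 24.78, 25.17),
    50: (6.59, 39.93, 25.60, 24.46),
    100: (7.42, 39.95, 28.60, 21.39),
    500: (7.61, 40.05, 21.96, 28.06),
    1000: (6.48, 39.62, 21.77, 28.23),
}


def max_relative_bias(table, max_ratio=None):
    """Largest |estimate - truth| / truth over rows with ratio <= max_ratio."""
    return max(
        abs(est - truth) / truth
        for ratio, row in table.items()
        if max_ratio is None or ratio <= max_ratio
        for est, truth in zip(row, true_values)
    )


print(max_relative_bias(table_1))                 # offspring deviations, all ratios
print(max_relative_bias(table_2, max_ratio=50))   # immigration deviations, ratio <= 50
print(max_relative_bias(table_2))                 # immigration deviations, all ratios
```

The largest relative bias stays below 5% in Table 1, below 10% in Table 2 for ratios up to 50, and is about 17% in Table 2 overall (reached for $b'm$ at ratio 500).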



APPENDIX A: APPENDIX SECTION 

A.1. Proof of Proposition 2.1. Consider first the initial distribution
$\pi_0(x)$. Successive conditionings yield

$\pi_0(x) = P(S_0 = s, T_0 = t)\, P(R_0 = r/s, t)\, P(V_0 = v/s, t, r)\, P(F_0 = f/s, t, r, v)$.

The first distribution is $\nu(s,t)$. Using definitions (2.4), (2.13), the last two
conditional distributions are equal to $p_4(v/r)$ and $p_5(f/v)$. The remaining
distribution in $\pi_0(x)$ is $\mathcal{L}(R_0/S_0, T_0)$. Using (A2) and (2.5), $R_0 = R'_0 + R''_0$,
where $R'_0$ and $R''_0$ are independent. Now, the distribution of $(S'_1, R'_0)$ (resp.
$(S''_1, R''_0)$) is the Multinomial distribution $\mathcal{M}(S_0; a, b)$ (resp. $\mathcal{M}(T_0; a', b')$),
leading to the Binomial distribution $\mathcal{B}(S_0; b)$ for the marginal distribution of
$R'_0$ (resp. $\mathcal{B}(T_0; b')$ for $R''_0$). By (A2), these two distributions are
independent conditionally on $(S_0, T_0)$, which yields (2.12) and the expression
of $\pi_0(x)$. Let us now study two successive generations. Using notations
(2.7), the conditional distribution of $X_{i+1}$ given $\mathcal{F}_i$ can be expressed, for
$x = (s,t,r,v,f)$, $x' = (s',t',r',v',f')$, as

$P(X_{i+1} = x'/\mathcal{F}_i) = P(S_{i+1} = s'/\mathcal{F}_i) \times P(T_{i+1} = t'/s'; \mathcal{F}_i) \times
P(R_{i+1} = r'/s', t'; \mathcal{F}_i) \times P(V_{i+1} = v'/s', t', r'; \mathcal{F}_i) \times P(F_{i+1} = f'/s', t', r', v'; \mathcal{F}_i)$.

The last three conditional distributions have already been computed for
getting $\pi_0(x)$. They are respectively equal to $p_3(r'/s', t')$, $p_4(v'/r')$ and $p_5(f'/v')$
defined in (2.12), (2.13). Let us compute $P(T_{i+1} = t'/s'; \mathcal{F}_i)$. Seeds on the
ground at cycle $(i + 1)$ come from two sources, offspring of mature plants $F_i$
and seed immigration during cycle $i + 1$, so that $T_{i+1}$ is the sum of the
offspring term $T'_{i+1}$ and of the immigration term. Using (A1), the
distribution of $T'_{i+1}$ is $G^{*F_i}$. By (A3), the immigration term is independent
of $S_{i+1}$, $T'_{i+1}$ and $\mathcal{F}_i$, so that $P(T_{i+1} = t'/s'; \mathcal{F}_i) = (G^{*f} * \mu)(t') = p_2(t'/f)$.
The last distribution is $P(S_{i+1} = s'/\mathcal{F}_i)$. Using (A2) and (2.5) yields that
$S_{i+1} = S'_{i+1} + S''_{i+1}$ where, conditionally on $(S_i, T_i)$, $(S'_{i+1}, R'_i)$ and
$(S''_{i+1}, R''_i)$ are distributed according to two independent Multinomial
distributions $\mathcal{M}(S_i; a, b)$ and $\mathcal{M}(T_i; a', b')$. Hence, the marginal distribution
of $S_{i+1}$ is,

nb l+1 - S /A) - {B{S ^ b) ^ B{T ^ v)){r) - Pl{S /S, t, T). 

Joining these results yields that $(X_i)$ is a time homogeneous Markov chain
with state space $\mathbb{N}^5$.

A.2. Proof of Proposition 3.1. We use in the sequel the Kullback-Leibler
divergence $\mathcal{K}(P, Q)$ of distribution $Q$ w.r.t. $P$. Recall its definition,

(A.1) $\mathcal{K}(P, Q) = -\int \log\frac{dQ}{dP}\, dP = -E_P\big(\log\frac{dQ}{dP}\big)$ if $Q \ll P$;

(A.2) $\mathcal{K}(P, Q) = +\infty$ otherwise.

This quantity is non-negative and equal to 0 if and only if $Q = P$ $P$-a.s.

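Let us first consider $l^2_K(c)$ and the corresponding log-likelihood for $d$,
defined in (3.9), (3.10). The maximum likelihood estimators are

(A.3) $\hat c_K = \frac{\sum_{k=1}^{K} \sum_{i=0}^{n} V_i^k}{\sum_{k=1}^{K} \sum_{i=0}^{n} R_i^k}$ ; $\hat d_K = \frac{\sum_{k=1}^{K} \sum_{i=0}^{n} F_i^k}{\sum_{k=1}^{K} \sum_{i=0}^{n} V_i^k}$.

Since the $K$ populations are independent, applying the strong law of large
numbers yields the strong consistency of $\hat c_K$ and $\hat d_K$. The random variables
$Z^k = \sum_{i=0}^{n} (V_i^k - c_0 R_i^k)$ are i.i.d. centered with variance $c_0(1 - c_0) E_{\theta_0}(\sum_{i=0}^{n} R_i)$,
so that the Central Limit Theorem yields that,

(A.4) as $K \to +\infty$, $\sqrt{K}(\hat c_K - c_0) \to \mathcal{N}\Big(0, \frac{c_0(1 - c_0)}{E_{\theta_0}(\sum_{i=0}^{n} R_i)}\Big)$ in distribution under $P_{\theta_0}$.

The proof concerning $\hat d_K$ is similar: $\hat d_K \to d_0$ $P_{\theta_0}$-a.s. and

(A.5) $\sqrt{K}(\hat d_K - d_0) \to \mathcal{N}\Big(0, \frac{d_0(1 - d_0)}{E_{\theta_0}(\sum_{i=0}^{n} V_i)}\Big)$ in distribution under $P_{\theta_0}$.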
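A small simulation can illustrate the ratio estimator (A.3) and the plug-in use of the asymptotic variance in (A.4). This is only a sketch under simplifying assumptions: the rosette counts are drawn here from an arbitrary Poisson distribution rather than from the full population model, and all variable names are chosen for this example.

```python
import numpy as np

rng = np.random.default_rng(1)
K, n, c0 = 300, 4, 0.21                     # c0 as in (5.2); K and n as in Section 5.1
# Illustrative rosette counts (assumption: i.i.d. Poisson(50), not the full model).
R = rng.poisson(50.0, size=(K, n + 1))
V = rng.binomial(R, c0)                     # V_i given R_i is Binomial(R_i, c0)

c_hat = V.sum() / R.sum()                   # estimator (A.3)
# Plug-in version of the asymptotic variance in (A.4):
# c(1 - c) divided by the empirical mean of sum_i R_i over populations.
avar = c_hat * (1.0 - c_hat) / R.sum(axis=1).mean()
std_error = np.sqrt(avar / K)
print(c_hat, std_error)
```

The estimate should fall within a few standard errors of `c0`; the same construction applies to $\hat d_K$ with $(V, F)$ in place of $(R, V)$.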

Consider now the estimation of $(a, b, a', b')$. Let us first check
identifiability. Applying the strong law of large numbers to the
log-likelihood $l_K(\theta; X_{0:n}(K))$ defined in (3.13), we get that, under
$P_{\theta_0}$, as $K \to +\infty$,

$\frac{1}{K} l_K(\theta; X_{0:n}(K)) \to E_{\theta_0} \sum_{i=0}^{n} \log (\mathcal{M}(S_i; a, b) * \mathcal{M}(T_i; a', b'))(S_{i+1}, R_i)$.

We can express the limit above using the Kullback-Leibler divergence defined
in (A.1),

$C(\theta_0) - E_{\theta_0} \sum_{i=0}^{n} \mathcal{K}\big(\mathcal{M}(S_i; a_0, b_0) * \mathcal{M}(T_i; a'_0, b'_0),\ \mathcal{M}(S_i; a, b) * \mathcal{M}(T_i; a', b')\big)$,

where $C(\theta_0)$ is a constant depending only on $\theta_0$, and $\mathcal{K}(P, Q)$ is the
Kullback-Leibler divergence of the two random probability distributions. Each
divergence in the sum above is non-negative and equal to 0 if and only if the
two distributions are identical a.s. under $P_{\theta_0}$. It is easy to check, using
the first and second moments of these two distributions, that this implies
$(a, b, a', b') = (a_0, b_0, a'_0, b'_0)$. Hence, $(a_0, b_0, a'_0, b'_0)$ is a strict maximum
of the limit above, which leads to the identifiability of these parameters.

Consider now the CLS estimators $(\hat a_K, \hat a'_K)$ which minimize $J^1_K(a, a')$
defined in (3.14). By the strong law of large numbers, under $P_{\theta_0}$, as $K \to \infty$,

(A.6) $\frac{1}{K} J^1_K(a, a') \to \sum_{i=0}^{n} E_{\theta_0}\big(((a_0 - a) S_i + (a'_0 - a') T_i)^2\big) + A(\theta_0)$,

where $A(\theta_0) = a_0(1 - a_0) E_{\theta_0}(\sum_{i=0}^{n} S_i) + a'_0(1 - a'_0) E_{\theta_0}(\sum_{i=0}^{n} T_i)$ only
depends on $\theta_0$. Since $S_i$ and $T_i$ are non-negative random variables, this limit
possesses a strict minimum at $(a, a') = (a_0, a'_0)$.

Denote by ${}^T M$ the transposition of a matrix $M$ and set $Z$ the $(n+1) \times 2$
matrix with rows equal to $(S_i, T_i)$, ${}^T S$ the vector $(S_0, \dots, S_n)$, ${}^T \tilde S$ the
vector $(S_1, \dots, S_{n+1})$, and $Z^k, S^k, \tilde S^k, T^k$ their values for population $k$. Then
$J^1_K(a, a')$ writes,

$J^1_K(a, a') = \sum_{k=1}^{K} \Big\| \tilde S^k - Z^k \begin{pmatrix} a \\ a' \end{pmatrix} \Big\|^2$,

(A.7) so that $\begin{pmatrix} \hat a_K \\ \hat a'_K \end{pmatrix} = \Big(\sum_{k=1}^{K} {}^T Z^k Z^k\Big)^{-1} \Big(\sum_{k=1}^{K} {}^T Z^k \tilde S^k\Big)$.

As functions of $(a, a')$, $\frac{1}{K} J^1_K(a, a')$ and its limit defined in (A.6) are twice
continuously differentiable a.s. The parameter set is compact, so we just
have to control the continuity modulus of the process $\frac{1}{K} J^1_K(a, a')$. It is
defined, for $\eta > 0$, by

$w(K, \eta) = \sup\big\{\frac{1}{K} |J^1_K(a_1, a'_1) - J^1_K(a_2, a'_2)| \ ;\ \|(a_1, a'_1) - (a_2, a'_2)\| \le \eta\big\}$.

By the Cauchy-Schwarz inequality, $w(K, \eta)$ is bounded by

$\Big\{\frac{1}{K} \sum_{k=1}^{K} \sum_{i=0}^{n} (2 S_{i+1}^k - (a_1 + a_2) S_i^k - (a'_1 + a'_2) T_i^k)^2\Big\}^{1/2} \Big\{\frac{1}{K} \sum_{k=1}^{K} \sum_{i=0}^{n} ((a_1 - a_2) S_i^k + (a'_1 - a'_2) T_i^k)^2\Big\}^{1/2}$.

The first term is a random variable converging $P_{\theta_0}$-a.s. to a deterministic
positive limit, and is thus bounded in probability. The second term is
bounded by the r.v. $2\eta \big(\frac{1}{K} \sum_{k=1}^{K} \sum_{i=0}^{n} ((S_i^k)^2 + (T_i^k)^2)\big)^{1/2} \to 2\eta \big(E_{\theta_0} \sum_{i=0}^{n} (S_i^2 + T_i^2)\big)^{1/2}$
$P_{\theta_0}$-a.s. Hence, as $K \to +\infty$, $\limsup w(K, \eta) \le \varphi(\eta)$, where $\varphi(\eta) \to 0$
as $\eta \to 0$. This ensures the consistency of $(\hat a_K, \hat a'_K)$ defined in (A.7)
(Dacunha-Castelle and Duflo, 1993; Van der Vaart, 1998).
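The conditional least squares estimator amounts to solving the normal equations of a regression of $S_{i+1}$ on $(S_i, T_i)$ without intercept. The sketch below illustrates this with NumPy under a simplifying assumption: the pairs $(S_i, T_i)$ are drawn independently from arbitrary Poisson distributions instead of being generated by the full population model, and all (population, cycle) pairs are stacked into one array of length `N`. Variable names are chosen for this example only.

```python
import numpy as np

rng = np.random.default_rng(2)
a0, a0_prime = 0.15, 0.006                  # values from (5.1)
N = 5000                                    # total number of (population, cycle) pairs
S = rng.poisson(50.0, size=N)               # illustrative seed-bank counts (assumption)
T = rng.poisson(80.0, size=N)               # illustrative new-seed counts (assumption)
# S_{i+1} = Binomial(S_i, a) + Binomial(T_i, a'), independent given (S_i, T_i).
S_next = rng.binomial(S, a0) + rng.binomial(T, a0_prime)

Z = np.column_stack([S, T]).astype(float)
# Normal equations of the least-squares criterion: (Z'Z) x = Z' S_next.
a_hat, a_prime_hat = np.linalg.solve(Z.T @ Z, Z.T @ S_next)
print(a_hat, a_prime_hat)
```

Solving the normal equations with `np.linalg.solve` avoids forming the inverse matrix explicitly; the estimators of $(b, b')$ and $(m, u)$ have the same structure with a different response vector or design matrix.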

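Consider now the asymptotic normality. The function $(a, a') \to J^1_K(a, a')$
being $C^2$ a.s., the gradient of $J^1_K(a, a')$ is

$DJ^1_K(a, a') = -2 \sum_{k=1}^{K} {}^T Z^k \Big(\tilde S^k - Z^k \begin{pmatrix} a \\ a' \end{pmatrix}\Big)$.

The $2 \times 2$ matrix containing the second derivatives of $J^1_K(a, a')$ is

$\nabla^2 J^1_K(a, a') = 2 \sum_{k=1}^{K} {}^T Z^k Z^k$.

Using now that $(\hat a_K, \hat a'_K)$ is consistent, a Taylor expansion of $DJ^1_K$ at
$(a_0, a'_0)$ yields

(A.8) $0 = \frac{1}{2\sqrt{K}} DJ^1_K(a_0, a'_0) + \frac{1}{2\sqrt{K}} \nabla^2 J^1_K \begin{pmatrix} \hat a_K - a_0 \\ \hat a'_K - a'_0 \end{pmatrix} + o_P(1)$,

where $o_P(1)$ denotes a remainder term that goes to 0 in $P_{\theta_0}$-probability.
The r.v. vectors $-{}^T Z^k \big(\tilde S^k - Z^k \begin{pmatrix} a_0 \\ a'_0 \end{pmatrix}\big)$ are i.i.d. centered, hence
$\frac{1}{2\sqrt{K}} DJ^1_K(a_0, a'_0)$ converges in distribution under $P_{\theta_0}$ to a centered Normal
distribution with covariance matrix $E_{\theta_0}({}^T Z V(\theta_0) Z)$ where $V(\theta_0)$ is the
$(n+1) \times (n+1)$ diagonal matrix with diagonal elements
$a_0(1 - a_0) S_i + a'_0(1 - a'_0) T_i$. Moreover, by the strong law of large numbers,
$\frac{1}{2K} \nabla^2 J^1_K$ converges a.s. to $E_{\theta_0}({}^T Z Z)$. By the Cauchy-Schwarz inequality,
this matrix is invertible and using (A.8) yields

$\sqrt{K} \begin{pmatrix} \hat a_K - a_0 \\ \hat a'_K - a'_0 \end{pmatrix} = (E_{\theta_0}({}^T Z Z))^{-1} \frac{1}{\sqrt{K}} \sum_{k=1}^{K} {}^T Z^k \Big(\tilde S^k - Z^k \begin{pmatrix} a_0 \\ a'_0 \end{pmatrix}\Big) + o_P(1)$.

Therefore, setting $\Sigma_1(\theta_0) = (E_{\theta_0}({}^T Z Z))^{-1} E_{\theta_0}({}^T Z V(\theta_0) Z) (E_{\theta_0}({}^T Z Z))^{-1}$,

(A.9) $\sqrt{K} \begin{pmatrix} \hat a_K - a_0 \\ \hat a'_K - a'_0 \end{pmatrix} \to \mathcal{N}(0, \Sigma_1(\theta_0))$ in distribution under $P_{\theta_0}$.

Similarly, setting ${}^T R$ the vector $(R_0, \dots, R_n)$, define

(A.10) $\begin{pmatrix} \hat b_K \\ \hat b'_K \end{pmatrix} = \Big(\sum_{k=1}^{K} {}^T Z^k Z^k\Big)^{-1} \Big(\sum_{k=1}^{K} {}^T Z^k R^k\Big)$.

The proof concerning the estimation of $(b, b')$ is similar; so we get, setting
$W(\theta_0)$ the diagonal matrix with diagonal elements $b_0(1 - b_0) S_i + b'_0(1 - b'_0) T_i$
and $\Sigma_2(\theta_0) = (E_{\theta_0}({}^T Z Z))^{-1} E_{\theta_0}({}^T Z W(\theta_0) Z) (E_{\theta_0}({}^T Z Z))^{-1}$,

(A.11) $\sqrt{K} \begin{pmatrix} \hat b_K - b_0 \\ \hat b'_K - b'_0 \end{pmatrix} \to \mathcal{N}(0, \Sigma_2(\theta_0))$ in distribution under $P_{\theta_0}$.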

Finally, let us study the estimation of $(m, u)$ based on the CLS process
$J_K(m, u)$ defined in (3.16). Under the assumptions (A1), (A3), $G(.)$ and $\mu(.)$
have finite variances $\delta^2$ and $\rho^2$. Denote by $\delta_0^2$ and $\rho_0^2$ the variances
associated with $\theta_0$. Then, as $K \to \infty$, under $P_{\theta_0}$,

$\frac{1}{K} J_K(m, u) \to E_{\theta_0}\Big(\sum_{i=0}^{n} (\delta_0^2 F_i + \rho_0^2)\Big) + \sum_{i=0}^{n} E_{\theta_0}\big([(m_0 - m) F_i + (u_0 - u)]^2\big)$.

Clearly, the above functional has a strict minimum at $(m_0, u_0)$, leading to
the identifiability of $(m, u)$. Let ${}^T F$ denote the vector $(F_0, \dots, F_n)$, $G$ the
$(n + 1) \times 2$ matrix with rows equal to $(F_i, 1)$, ${}^T T$ the vector $(T_1, \dots, T_{n+1})$,
and $F^k, G^k, T^k$ their values in population $k$. Then, the conditional least
squares estimator is

(A.12) $\begin{pmatrix} \hat m_K \\ \hat u_K \end{pmatrix} = \Big(\sum_{k=1}^{K} {}^T G^k G^k\Big)^{-1} \Big(\sum_{k=1}^{K} {}^T G^k T^k\Big)$.

Consistency and asymptotic normality are obtained with a proof similar to
the one detailed above. Define $W'(\theta_0)$ the diagonal matrix with diagonal
elements $\delta_0^2 F_i + \rho_0^2$, and $\Sigma_3(\theta_0) = (E_{\theta_0}({}^T G G))^{-1} E_{\theta_0}({}^T G W'(\theta_0) G) (E_{\theta_0}({}^T G G))^{-1}$,
then

(A.13) $\sqrt{K} \begin{pmatrix} \hat m_K - m_0 \\ \hat u_K - u_0 \end{pmatrix} \to \mathcal{N}(0, \Sigma_3(\theta_0))$ in distribution under $P_{\theta_0}$.

Using now that the likelihood splits into five terms that can be maximized 
separately leads to the asymptotic independence of the estimators stated in 
Proposition 3.1.

A.3. Proof of Proposition 4.1. Let us prove Proposition 4.1 by induction.
By Assumption (A6), the property holds for $i = 0$: $S_0$ and $T_0$ are independent
and $S_0 \sim \mathcal{P}(\sigma) = \mathcal{P}(\Gamma_0)$, $T_0 \sim \mathcal{P}(\tau) = \mathcal{P}(\Gamma'_0)$. Assume that the property
holds for $i \ge 1$, i.e. using definitions (4.1), (4.3), $S_i$ and $T_i$ are independent
conditionally on $\mathcal{G}_{i-1}$ and $\mathcal{L}(S_i/\mathcal{G}_{i-1}) = \mathcal{P}(\Gamma_i)$, $\mathcal{L}(T_i/\mathcal{G}_{i-1}) = \mathcal{P}(\Gamma'_i)$.
Using (2.5), $(S_{i+1}, R_i) = (S'_{i+1}, R'_i) + (S''_{i+1}, R''_i)$, where the conditional
distribution of $(S'_{i+1}, R'_i)$ (resp. $(S''_{i+1}, R''_i)$) given $(S_i, T_i)$ is the Multinomial
distribution $\mathcal{M}(S_i; a, b)$ (resp. $\mathcal{M}(T_i; a', b')$). Using Assumption (A2), these
two distributions are independent conditionally on $S_i$, $T_i$ and applying now
Lemma 4.1 conditionally on $\mathcal{G}_{i-1}$ to $\{S_i \sim \mathcal{P}(\Gamma_i), \mathcal{M}(S_i; a, b)\}$ and to
$\{T_i \sim \mathcal{P}(\Gamma'_i), \mathcal{M}(T_i; a', b')\}$ yields that the four variables $S'_{i+1}$, $R'_i$, $S''_{i+1}$, $R''_i$
are independent conditionally on $\mathcal{G}_{i-1}$ and that $S'_{i+1} \sim \mathcal{P}(a\Gamma_i)$, $S''_{i+1} \sim \mathcal{P}(a'\Gamma'_i)$,
$R'_i \sim \mathcal{P}(b\Gamma_i)$ and $R''_i \sim \mathcal{P}(b'\Gamma'_i)$. Hence, $S_{i+1}$ and $R_i$ are independent
conditionally on $\mathcal{G}_{i-1}$ and $S_{i+1} \sim \mathcal{P}(a\Gamma_i + a'\Gamma'_i)$, and $R_i \sim \mathcal{P}(b\Gamma_i + b'\Gamma'_i)$.
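The thinning property used in this step (a Poisson count split multinomially gives independent Poisson counts) can be checked empirically. The sketch below is an illustration only, with arbitrary values of $\Gamma$, $a$ and $b$; the multinomial split is drawn as two successive binomial draws, which is an equivalent way of sampling it.

```python
import numpy as np

rng = np.random.default_rng(3)
gamma, a, b = 50.0, 0.15, 0.5               # illustrative values
N = 200_000
S = rng.poisson(gamma, size=N)              # S ~ Poisson(gamma)

# Multinomial split of S into (survive, become rosette, die), drawn sequentially:
# first the survivors, then the rosettes among the remaining seeds.
S_prime = rng.binomial(S, a)
R_prime = rng.binomial(S - S_prime, b / (1.0 - a))

# For Poisson variables the mean equals the variance; independence implies
# a correlation close to zero.
print(S_prime.mean(), S_prime.var())        # both close to a * gamma
print(R_prime.mean(), R_prime.var())        # both close to b * gamma
print(np.corrcoef(S_prime, R_prime)[0, 1])  # close to 0
```

Empirical means and variances close to $a\Gamma$ and $b\Gamma$, together with a near-zero correlation, are consistent with the conclusion of Lemma 4.1 as it is used above.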

Let us now prove that $\mathcal{L}(S_{i+1}/\mathcal{G}_i) = \mathcal{L}(S_{i+1}/\mathcal{G}_{i-1})$. Let $\phi_1$ and $\phi_2$ be two
measurable functions of $(Y_{0:i-1}, S_{i+1})$ and $(Y_{0:i-1}, R_i, V_i, F_i)$. Using (2.13) and
(2.14), set $\psi_2(r) = \int \phi_2(r, v, f)\, p_4(v/r)\, p_5(f/v)\, dv\, df$; then

$E^{\mathcal{G}_{i-1}}(\phi_1(S_{i+1}) \phi_2(R_i, V_i, F_i)) = E^{\mathcal{G}_{i-1}}(\phi_1(S_{i+1}) \psi_2(R_i)) = E^{\mathcal{G}_{i-1}}(\phi_1(S_{i+1}))\, E^{\mathcal{G}_{i-1}}(\psi_2(R_i))$,

since $S_{i+1}$ and $R_i$ are independent conditionally on $\mathcal{G}_{i-1}$. Hence, $S_{i+1}$ and
$(R_i, V_i, F_i)$ are independent conditionally on $\mathcal{G}_{i-1}$ and,

$\mathcal{L}(S_{i+1}/\mathcal{G}_i) = \mathcal{L}(S_{i+1}/\mathcal{G}_{i-1}) = \mathcal{P}(a\Gamma_i + a'\Gamma'_i)$.

Consider two measurable functions $\phi_3$, $\phi_4$ of $(Y_{0:i}, S_{i+1})$ and $(Y_{0:i}, T_{i+1})$; then
using (2.12), (2.14), Assumption (A2) and the conditional independence
given $\mathcal{G}_{i-1}$ of $S_{i+1}$ and $(R_i, V_i, F_i)$ yield,

(A.14) $E^{\mathcal{G}_i}(\phi_3(S_{i+1}) \phi_4(T_{i+1})) = E^{\mathcal{G}_i}\Big(\phi_3(S_{i+1}) \int \phi_4(t')\, p_2(t'/F_i)\, dt'\Big)$

$= E^{\mathcal{G}_i}(\phi_3(S_{i+1})) \int \phi_4(t')\, p_2(t'/F_i)\, dt' = E^{\mathcal{G}_i}(\phi_3(S_{i+1}))\, E^{\mathcal{G}_i}(\phi_4(T_{i+1}))$.

Therefore, conditionally on $\mathcal{G}_i$, $S_{i+1}$ and $T_{i+1}$ are independent and using now
Assumption (A5), $\mathcal{L}(T_{i+1}/\mathcal{G}_i) = \mathcal{P}(mF_i + u)$. The property holds for $i + 1$
with $\Gamma_{i+1} = a\Gamma_i + a'\Gamma'_i$ and $\Gamma'_{i+1} = mF_i + u$, which are the definitions given
in (4.3).

A.4. Proof of Theorem 4.2. The likelihood $l^1_K(\theta)$ defined in (4.19)
sums up the available information associated with the incomplete model.
An application of the strong law of large numbers yields,

$\frac{1}{K} l^1_K(\theta) \to L(\theta_0, \theta) = E_{\theta_0} \sum_{i=0}^{n} (-\Lambda_i(\theta, Y_{0:i-1}) + R_i \log \Lambda_i(\theta, Y_{0:i-1}))$.

According to Theorem 4.1, conditionally on $\mathcal{G}_{i-1}$ the random variables $R_i$
are Poisson distributed with parameter $\Lambda_i(\theta_0, Y_{0:i-1})$ under $P_{\theta_0}$. Hence,

$L(\theta_0, \theta) = E_{\theta_0} \sum_{i=0}^{n} (-\Lambda_i(\theta, Y_{0:i-1}) + \Lambda_i(\theta_0, Y_{0:i-1}) \log \Lambda_i(\theta, Y_{0:i-1}))$.

Using the explicit form of the Kullback-Leibler divergence between two
Poisson distributions, $L(\theta_0, \theta)$ writes,

$L(\theta_0, \theta) = -\sum_{i=0}^{n} E_{\theta_0}\{\mathcal{K}(\mathcal{P}(\Lambda_i(\theta_0, Y_{0:i-1})), \mathcal{P}(\Lambda_i(\theta, Y_{0:i-1})))\} + C(\theta_0)$,

where $C(\theta_0) = E_{\theta_0} \sum_{i=0}^{n} (-\Lambda_i(\theta_0, Y_{0:i-1}) + \Lambda_i(\theta_0, Y_{0:i-1}) \log \Lambda_i(\theta_0, Y_{0:i-1}))$ does
not depend on $\theta$. The identifiability condition for $\theta_0$ is therefore
equivalent to

$\{L(\theta_0, \theta) = L(\theta_0, \theta_0)\} \Rightarrow \{\theta = \theta_0\}$.

The Kullback-Leibler divergence of two Poisson distributions $\mathcal{K}(\mathcal{P}(\mu_0), \mathcal{P}(\mu))$
is non-negative and equal to 0 if and only if $\mu = \mu_0$. Hence, $L(\theta_0, \theta)$ reaches
its maximum value $L(\theta_0, \theta_0)$ if and only if

$\Lambda_i(\theta, Y_{0:i-1}) = \Lambda_i(\theta_0, Y_{0:i-1})$ $P_{\theta_0}$-a.s. for $i = 0, \dots, n$.
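For reference, the explicit Poisson divergence invoked in this argument can be written out from definition (A.1). The short derivation below is a standard computation added for the reader's convenience, with $P = \mathcal{P}(\mu_0)$ and $Q = \mathcal{P}(\mu)$; it is not quoted from the paper.

```latex
\begin{align*}
\mathcal{K}(\mathcal{P}(\mu_0),\mathcal{P}(\mu))
 &= -\sum_{x\ge 0} e^{-\mu_0}\frac{\mu_0^x}{x!}\,
    \log\frac{e^{-\mu}\mu^x}{e^{-\mu_0}\mu_0^x}
  = -\Big(\mu_0-\mu+\mu_0\log\frac{\mu}{\mu_0}\Big)\\
 &= \mu_0\Big(\frac{\mu}{\mu_0}-1-\log\frac{\mu}{\mu_0}\Big)\;\ge\;0 .
\end{align*}
```

Since $t - 1 - \log t \ge 0$ with equality only at $t = 1$, the divergence vanishes if and only if $\mu = \mu_0$, which is the property used for each index $i$.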

Since the $\Lambda_i(\theta, Y_{0:i-1})$ depend on $(F_{i-1}, \dots, F_0)$, the above condition can
hold only if the coefficients associated with the random variables $F_j$ in the
above expression are equal. The proof below is just elementary algebra based
on this property. We obtain, using for the $c_i(\theta)$ the definitions given in (4.7)
and (4.15):

If $n = 0$, $\Lambda_0(\theta) = c_0(\theta)$ is deterministic. Only $c_0(\theta)$ is identifiable: it is the
first condition stated in Theorem 4.2.

If $n = 1$, we have $\Lambda_1(\theta, Y_{0:0}) = b'm F_0 + c_1(\theta)$. Since $F_0$ is random,
the two random variables $\Lambda_1(\theta, Y_{0:0})$ and $\Lambda_1(\theta_0, Y_{0:0})$ are $P_{\theta_0}$-a.s. equal
iff $b'm = b'_0 m_0$ and $c_1(\theta) = c_1(\theta_0)$, which leads to the identifiability of
$(b'm, c_0(\theta), c_1(\theta))$.

If $n = 2$, $\Lambda_2(\theta) = b'm F_1 + \frac{a'b}{b'} b'm F_0 + c_2(\theta)$. We thus get two additional
conditions which lead to the identifiability of $(\frac{a'b}{b'}, b'm, c_0(\theta), c_1(\theta), c_2(\theta))$.

If $n = 3$, $\Lambda_3(\theta) = b'm F_2 + \frac{a'b}{b'} b'm (F_1 + a F_0) + c_3(\theta)$. Hence, $F_0, F_1, F_2$
being random variables, we get that $(a, \frac{a'b}{b'}, b'm)$ are identifiable. Now,
identifying $(b\sigma, b'\tau, b'u)$ consists in solving a linear system using the
conditions on $(c_0(\theta), c_1(\theta), c_2(\theta))$. We obtain:

- if $a \ne \frac{a'b}{b'}$, $(a, \frac{a'b}{b'}, b'm, b'u, b\sigma, b'\tau)$ is identifiable.

- if $a = \frac{a'b}{b'}$, the identifiable parameters are $(a, b'm, b'u, b\sigma + b'\tau)$.

Noting that, for $i \ge 1$, $c_{i+1}(\theta) - a c_i(\theta) = (1 - a + \frac{a'b}{b'}) b'u$ and that only
$(a, b'm, \frac{a'b}{b'})$ enters in the $F_j$'s coefficients, it can be checked that observing
more generations does not lead to the identifiability of additional parameters.

A.5. Proof of Proposition 4.2. Let $\phi = (a, \frac{a'b}{b'}, b'm, b'u, b\sigma, b'\tau)$ with
$a \ne \frac{a'b}{b'}$. Under Assumption (A7), we can define $K_1, K_2, a_1, a_2$ such that

$\Phi \subset [K_1, K_2]^5 \times [a_1, a_2]$, with $0 < K_1 < K_2 < +\infty$ and $0 < a_1 < a_2 < 1$.

The normalized loglikelihood process can be written as

(A.15) $\frac{1}{K} l^1_K(\phi; Y_{0:n}(K)) = \frac{1}{K} \sum_{k=1}^{K} J(\phi, Y^k_{0:n})$ with

$J(\phi, Y^k_{0:n}) = \sum_{i=0}^{n} R_i^k \log \Lambda_i(\phi, Y^k_{0:i-1}) - \Lambda_i(\phi, Y^k_{0:i-1})$.

These random variables are i.i.d. and we have to study the behaviour of
their empirical distribution with respect to $\phi$. For getting the consistency
of the associated maximum likelihood estimator, we have to prove that the
parametric class $\{P_\phi, \phi \in \Phi\}$ is Glivenko-Cantelli (see e.g. Van der Vaart,
1998). There is no closed form for the density of these variables, so that no
generic argument can here be applied: we thus propose a direct proof.

The functional $\{\phi \to J(\phi, Y_{0:n})\}$ is a.s. twice continuously differentiable
on $\Phi$. Let $D\Lambda_i(\phi)$ denote the gradient in $\mathbb{R}^6$ of $\Lambda_i(\phi, Y_{0:i-1})$. Then,

$|J(\phi'; Y_{0:n}) - J(\phi''; Y_{0:n})| \le \Big(\sum_{i=0}^{n} \sup_{\phi \in \Phi} \Big\|\Big(\frac{R_i}{\Lambda_i(\phi)} + 1\Big) D\Lambda_i(\phi)\Big\|\Big) \|\phi' - \phi''\|$.

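Under (A7), we have that, for all $i$, $\Lambda_i(\phi) \ge K_1 > 0$. Let us compute $D\Lambda_i(\phi)$.
Using (4.4) for the definitions of $c_i(\phi)$ and $\Lambda_i(\phi)$, let us set

$\Lambda_i(\phi) = b'm F_{i-1} + \frac{a'b}{b'} (b'm)(F_{i-2} + a F_{i-3} + \dots + a^{i-2} F_0) + c_i(\phi)$.

For $i = 0$, we have

$\frac{\partial \Lambda_0(\phi)}{\partial a} = \frac{\partial \Lambda_0(\phi)}{\partial (a'b/b')} = \frac{\partial \Lambda_0(\phi)}{\partial (b'm)} = \frac{\partial \Lambda_0(\phi)}{\partial (b'u)} = 0, \quad \frac{\partial \Lambda_0(\phi)}{\partial (b\sigma)} = \frac{\partial \Lambda_0(\phi)}{\partial (b'\tau)} = 1$.

Using the convention that non properly defined terms are set to 0 (e.g. the
sum $(F_{i-3} + 2a F_{i-4} + \dots)$ is set to 0 for $i \le 2$), we get, for $i \ge 1$,

$\frac{\partial \Lambda_i(\phi)}{\partial a} = \frac{a'b}{b'} (b'm)(F_{i-3} + 2a F_{i-4} + \dots + (i-2) a^{i-3} F_0) + \frac{\partial c_i(\phi)}{\partial a}$, with

$\frac{\partial c_i(\phi)}{\partial a} = i a^{i-1} (b\sigma) + \frac{a'b}{b'} (b'u) \frac{\partial}{\partial a}\Big(\frac{1 - a^{i-1}}{1 - a}\Big) + \frac{a'b}{b'} (b'\tau)(i-1) a^{i-2}$,

$\frac{\partial \Lambda_i(\phi)}{\partial (a'b/b')} = (b'm)(F_{i-2} + a F_{i-3} + \dots + a^{i-2} F_0) + (b'\tau) a^{i-1} + b'u \frac{1 - a^{i-1}}{1 - a}$,

$\frac{\partial \Lambda_i(\phi)}{\partial (b'm)} = F_{i-1} + \frac{a'b}{b'} (F_{i-2} + a F_{i-3} + \dots + a^{i-2} F_0)$,

$\frac{\partial \Lambda_i(\phi)}{\partial (b'u)} = 1 + \frac{a'b}{b'} \frac{1 - a^{i-1}}{1 - a}$,

$\frac{\partial \Lambda_i(\phi)}{\partial (b\sigma)} = a^i$,

$\frac{\partial \Lambda_i(\phi)}{\partial (b'\tau)} = \frac{a'b}{b'} a^{i-1}$.

All these partial derivatives are non-negative for $i \ge 1$ and, except the first
one $\frac{\partial \Lambda_i(\phi)}{\partial a}$, they are bounded from above by $M_1 (1 + \sum_{j=0}^{i-1} F_j)$ where $M_1$ is a
constant determined by $\Phi$ (since on $\Phi$, $0 < a_1 \le a \le a_2 < 1$). Noting now
that the application $\{a \to i a^{i-1}\}$ satisfies, for all $i \ge 1$, $\alpha_1 \le i a^{i-1} \le \alpha_2$ on
$[a_1, a_2] \subset (0, 1)$ with $0 < \alpha_1 < \alpha_2 < +\infty$, we can also bound $\frac{\partial \Lambda_i(\phi)}{\partial a}$ by
$M_2 (1 + \sum_{j=0}^{i-1} F_j)$. Joining these bounds, we get,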

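$\sup_{\phi \in \Phi} \Big\|\Big(\frac{R_i}{\Lambda_i(\phi)} + 1\Big) D\Lambda_i(\phi)\Big\| \le M_3 \Big(1 + \sum_{j=0}^{i-1} F_j\Big) R_i$, and, for $\|\phi' - \phi''\| \le \eta$,

$|J(\phi'; Y_{0:n}) - J(\phi''; Y_{0:n})| \le \eta M_3 Z_n$ with $Z_n = \sum_{i=0}^{n} \Big(1 + \sum_{j=0}^{i-1} F_j\Big) R_i$.

Using now that $E_{\phi_0}(R_i \sum_{j=0}^{i-1} F_j) = E_{\phi_0}(\Lambda_i(\phi_0) \sum_{j=0}^{i-1} F_j)$, $Z_n$ satisfies,

$E_{\phi_0} Z_n \le n\, E_{\phi_0}\Big(\sum_{i=0}^{n} (1 + F_i)^2\Big)$.

This is finite since $G(.)$ and $\mu(.)$ have finite variance and $n$ is prescribed.
Let us define the r.v. $Z_n^k = \sum_{i=0}^{n} (1 + \sum_{j=0}^{i-1} F_j^k) R_i^k$. Then the continuity
modulus of $\frac{1}{K} l^1_K(\phi)$ verifies:

$w(K, \eta) = \sup_{\|\phi' - \phi''\| \le \eta} \frac{1}{K} \sum_{k=1}^{K} |J(\phi'; Y^k_{0:n}) - J(\phi''; Y^k_{0:n})| \le \frac{\eta M_3}{K} \sum_{k=1}^{K} Z_n^k$.

Using now that $E_{\phi_0} Z_n < \infty$, we can apply the strong law of large numbers
to the upper bound of $w(K, \eta)$, which is a sufficient condition for ensuring
the consistency of $\hat\phi_K$.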

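It remains to study the asymptotic normality of the estimators. It is easy
to check that the random variables ${}^T W^k = (W_1^k, \dots, W_6^k)$ with

$W_p^k = \sum_{i=0}^{n} \Big(\frac{R_i^k}{\Lambda_i^k(\phi_0)} - 1\Big) \frac{\partial \Lambda_i^k}{\partial \phi_p}(\phi_0)$

are i.i.d. centered under $P_{\phi_0}$, with covariance matrix $\Sigma$ given, for $1 \le p, q \le 6$, by

$\Sigma_{pq} = I_{pq}(\phi_0) = E_{\phi_0} \sum_{i=0}^{n} \frac{1}{\Lambda_i(\phi_0)} \frac{\partial \Lambda_i}{\partial \phi_p}(\phi_0) \frac{\partial \Lambda_i}{\partial \phi_q}(\phi_0)$.

This matrix is well-defined and finite since, for all $i$, $\Lambda_i(\phi) \ge K_1 > 0$ and
$|\frac{\partial \Lambda_i}{\partial \phi_p}| \le M_3 (1 + \sum_{j=0}^{i-1} F_j)$ for $p = 1, \dots, 6$. Now, $\hat\phi_K$ is a zero of the score
function, so that a Taylor expansion at $\phi_0$ yields, using the consistency of
the vector $\hat\phi_K$,

$0 = \frac{1}{\sqrt{K}} Dl^1_K(\hat\phi_K) = \frac{1}{\sqrt{K}} Dl^1_K(\phi_0) + \Big(\frac{1}{K} \nabla^2 l^1_K(\phi_0) + R_K(\phi_0, \hat\phi_K)\Big) \sqrt{K} (\hat\phi_K - \phi_0)$.

The term $\nabla^2 l^1_K(\phi_0)$ contains the second derivatives of $l^1_K(\phi)$ w.r.t. $\phi_p$, $\phi_q$ and
$R_K(\phi_0, \hat\phi_K)$ is the remainder term of the Taylor expansion. So, the strong
law of large numbers yields: $\frac{1}{K} \nabla^2 l^1_K(\phi_0) \to -I(\phi_0)$, whose entries are
$-E_{\phi_0} \sum_{i=0}^{n} \frac{1}{\Lambda_i(\phi_0)} \frac{\partial \Lambda_i}{\partial \phi_p}(\phi_0) \frac{\partial \Lambda_i}{\partial \phi_q}(\phi_0)$.
The remainder term $R_K(\phi_0, \hat\phi_K)$ is bounded uniformly on $\Phi$ by $\|\hat\phi_K - \phi_0\| Z_K$,
with $Z_K = \sup\{\frac{1}{K} \|\nabla^3 l^1_K(\phi)\|, \phi \in \Phi\}$. Using that $Z_K$ is bounded uniformly
on $\Phi$ by $n^2 M_6 (1 + \sum_{i,j=0}^{n} F_i F_j)^2$, we get that $R_K(\phi_0, \hat\phi_K)$ goes to 0 under $P_{\phi_0}$,
which leads to the result stated in Proposition 4.2 provided that $\Sigma(\phi_0) = I(\phi_0)$ is
invertible.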